MEMBERSHIP OF THE AMERICAN STATISTICAL
ASSOCIATION ON ITS HUNDREDTH ANNIVERSARY 


By Richard L. Funkhouser, Secretary
American Statistical Association 


According to the most recent available figures, about one-half of
the Fellows and Regular Members of the American Statistical
Association are located in the two largest chapters, somewhat over one-
half find that their major interest in statistical data and methods lies in 
economics or business, and nearly a fourth report that the preparation 
of questionnaires and record forms is one of the statistical methods of 
principal importance to them. 

These are among the conclusions to be drawn from an analysis of 
the returns furnished by the Fellows and Regular Members during the 
spring of 1940 on a special directory form used in compiling the Centenary
Membership Directory.1 This article presents some of the tabulations
compiled from the data, the information having been placed on
punch cards and the tabulations made through the courtesy of a prominent
member and former officer of the Association.2

Of the 2,644 members of the Association for whom data were compiled,3
returns were received from all except 461; some information on
the latter was derived from the regular mailing list. With the exception
of their geographic distribution,4 no attempt has been made to determine
whether those furnishing information were representative of the
entire membership. It is probable that some biases were introduced by


1 This JOURNAL, Vol. 35, No. 210, Part 2, June, 1940.

2 The member in question requested that he be permitted to remain anonymous. In behalf of the 
officers and members of the Association, we take this occasion to acknowledge our indebtedness to him 
for his contribution and to express our sincere appreciation. 

3 The data constituting the basis of this article differ substantially from those in three notes published
a decade or more ago relating to the composition of the membership of the American Statistical
Association. The first was “Classifications of Members of American Statistical Association on Basis
of Duties and Interests,” by Willford I. King, then Secretary of the Association; see this JOURNAL,
Vol. 22, June, 1927, pp. 224-226. The other two were by Stuart A. Rice and Morris Green: “Interlocking
Memberships of Social Science Societies,” this JOURNAL, Vol. 24, September, 1929, pp. 303-306;
and “Composition of the American Statistical Association,” this JOURNAL, Vol. 25, June, 1930, pp.
198-202.

4 In the localities cited below, the proportions failing to reply to the questionnaire were somewhat 
higher than the average for the entire membership: Middle Atlantic states, in which the New York 
District Chapter is situated; South Atlantic states, where most of the members were in the Washington 
Chapter area; and the foreign countries. 
the fact that about a sixth of the members failed to send in their completed
returns, but it seems hazardous to estimate the nature or extent
of the biases. 


DISTRIBUTION OF MEMBERSHIP BY TYPES OF ORGANIZATIONS 


About 29 per cent of the members from whom reports were received 
were connected with a federal government agency, and another nine per 
cent with a state or local government agency, making a total of 38 per 
cent engaged in government work. About 23 per cent of the members 
reporting were connected with a college or university, and 21 per cent 
were employed in one of the branches of business specifically listed on 
the form. Sixteen per cent were associated with other businesses or 
professions, and somewhat less than two per cent were retired. The 
proportions, together with the numbers of members in each category, 
are shown in Table I.5 

TABLE I

MEMBERSHIP OF THE AMERICAN STATISTICAL ASSOCIATION, BY TYPES OF
ORGANIZATIONS WITH WHICH CONNECTED

                                        Number of   Percentage of total number
Type of organization                    members     of members returning report

College or university                        509            23.4
Local government agency                       36             1.6
State government agency                      159             7.3
Federal government agency                    634            29.1
Financial institution or agency              228            10.4
Public utility company                        86             3.9
Manufacturing industry                       113             5.2
Retail or wholesale trade concern             33             1.5
Other businesses and professions             350            16.0
Retired                                       35             1.6

Total                                      2,183           100.0
Unknown (No form received)                   461
Grand total                                2,644
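
The percentages in Table I follow directly from the counts. As a minimal sketch (the names below are illustrative and form no part of the original tabulation, which was made on punch cards), the distribution and the combined government share cited in the text can be recomputed as follows:

```python
# Counts of members returning the directory form, transcribed from Table I.
TABLE_I_COUNTS = {
    "College or university": 509,
    "Local government agency": 36,
    "State government agency": 159,
    "Federal government agency": 634,
    "Financial institution or agency": 228,
    "Public utility company": 86,
    "Manufacturing industry": 113,
    "Retail or wholesale trade concern": 33,
    "Other businesses and professions": 350,
    "Retired": 35,
}


def percentage_distribution(counts):
    """Return each category's share of the total, in per cent, to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = percentage_distribution(TABLE_I_COUNTS)

# Combined head count and share of local, state, and federal agencies.
government = sum(n for name, n in TABLE_I_COUNTS.items() if "government" in name)
government_share = round(100 * government / sum(TABLE_I_COUNTS.values()), 1)

for name, pct in shares.items():
    print(f"{name:<36}{pct:>6.1f}")
print(f"{'Government, all levels':<36}{government_share:>6.1f}")
```

Recomputed in this way, one or two shares fall a tenth of a point below the printed figures (23.3 against 23.4 for colleges and universities, for example). The printed column may have been adjusted so that it adds to exactly 100.0, but that is a conjecture.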











CLUSTERS OF SUBJECT-MATTER INTERESTS 


Nearly one-fourth of the members indicated that their major interest 
in statistical data was in economics, and another fourth reported their 
major statistical interest in subjects falling in the various fields of 
business. The remaining half of the membership designated fields scat- 
tered over a wide range of interests. Outside of economics and business, 
the largest representation was among the sociological subjects. 


5 The distribution presented here is not comparable with that along similar lines prepared under 
Professor King’s direction in 1927; see op. cit., p. 225. 
These facts are among those gleaned from the answers to the following
question on the directory form:


In what field does your interest in statistical data and methods 
center? (Underline the principal field: check not more than three 
others.) 


There followed a list of thirty-one specific fields of statistical interests, 
and a line for inserting additional fields if desired. Of the 2,183 members 
who returned the form, 261 designated no field of principal interest and 
98 reported a combination of fields. The analysis that follows is based 
on the remaining 1,824 reports. 
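
The successive exclusions that lead from the full membership to the 1,824 usable reports can be laid out step by step. The following is only a sketch of the arithmetic, using the figures quoted in the text; the variable names are illustrative.

```python
# Figures quoted in the text; the variable names are illustrative only.
members_on_roll = 2644        # members for whom data were compiled
no_form_received = 461        # members from whom no form was received
no_principal_field = 261      # forms designating no field of principal interest
combination_of_fields = 98    # forms reporting a combination of fields

forms_returned = members_on_roll - no_form_received
usable_reports = forms_returned - no_principal_field - combination_of_fields

# Share of the membership that returned no form ("about a sixth").
nonresponse_pct = round(100 * no_form_received / members_on_roll, 1)

print(forms_returned, usable_reports, nonresponse_pct)
```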

The two tables that follow present the relevant data considered in 
this section. Table II shows the distribution of the principal interests 
of the members, in absolute and in percentage terms.6 In Table III
are the numbers of members, by broad groups reflecting their principal 
interests, together with the percentages of the total number of mem- 
bers in each broad group who checked each subject as a subsidiary 
interest. 

The indications of principal interests hide many important and interesting
features concerning the clusters and varieties of subject-matter
interests that characterized those who were members of the Association.
This section is devoted to a consideration of these attributes.

Economics. This broad field is composed of six subjects which seemed 
to fall more or less definitely together. The first three are comprehen- 
sive: general economic research, general indexes of economic activity, and 
price indexes and price analysis.7 The other three are more specific:
labor statistics, wages and incomes, and taxes and public finance. 

In this field, as in some discussed below, it was necessary to make 
somewhat arbitrary decisions concerning the inclusion or exclusion of 
specific subjects. For example, labor statistics and wages and incomes, 
included in economics, also have many social implications; the subject 
of unemployment, classified in the sociological field, has many economic 
phases. 

A total of 533 members indicated a major interest in one of the fields 
classified as economics. Of these, about half designated general economic 
research as their principal interest, about one-seventh underlined general
indexes of economic activity and an equal proportion labor statistics;
eight per cent specified price indexes and price analysis; the remainder 
designated wages and incomes or taxes and public finance. 

6 A preliminary summary of the distribution of principal interests was presented last winter in the
American Statistical Association Bulletin, Vol. 1, No. 12, December, 1940, pp. 108-109. 


7 The exact or abbreviated titles of the fields, as listed on the form, are here presented in italics; 
this practice is followed throughout the paper. 
TABLE II

MEMBERSHIP OF THE AMERICAN STATISTICAL ASSOCIATION, BY FIELDS OF
PRINCIPAL INTEREST IN STATISTICAL DATA AND METHODS

Field of principal interest in statistical       Number of    Percentage
data and methods                                 members      of sub-total

Economic:
  General economic research                          270        14.8
  General indexes of economic activity                77         4.2
  Price indexes and price analysis                    42         2.3
  Labor statistics                                    76         4.2
  Wages and incomes                                   30         1.6
  Taxes and public finance                            38         2.1
    Sub-total, economic                              533        29.2

Non-financial business:
  Manufacturing production, employment, etc.          50         2.7
  Wholesale and retail trade                          27         1.5
  International trade                                 14         0.8
  Market research and advertising                     53         2.9
  Business management                                 47         2.6
  Real estate and construction                        30         1.6
  Public utilities, including transportation          60         3.3
  Accounting                                          34         1.9
    Sub-total, non-financial business                315        17.3

Financial business:
  Investments and security markets                   113         6.2
  Banking and credit                                  52         2.9
  Insurance                                           69         3.8
    Sub-total, financial business                    234        12.8

Sociological:
  Population research and vital statistics           113         6.2
  Public health research and administration           57         3.1
  Unemployment                                        51         2.8
  Sociological research                               54         3.0
  Relief and welfare services                        115         6.3
    Sub-total, sociological                          390        21.4

Psychology and education:
  Psychological research                              33         1.8
  Personnel and vocational guidance                   15         0.8
  Educational research                                33         1.8
  Educational administration                           3         0.2
    Sub-total, psychology and education               84         4.6

Other:
  Statistics of agriculture and rural life            77         4.2
  Control of industrial processes                      9         0.5
  Agricultural experimentation                        13         0.7
  Biological research                                 23         1.3
  Research in mathematical statistics                 52         2.9
  Other                                               94         5.2
    Sub-total, other                                 268        14.7

Sub-total                                          1,824       100.0
Combinations and no field specified                  359
Total                                              2,183
TABLE III

MEMBERSHIP OF THE AMERICAN STATISTICAL ASSOCIATION, BY SUBSIDIARY
INTERESTS AND BY BROAD GROUPS OF PRINCIPAL INTEREST IN STATISTICAL
DATA AND METHODS*

                                                            Broad groups of principal interest

                                                              Non-                        Psych.
Fields of subsidiary interest in                         financial Financial    Socio-       and
statistical data and methods                    Economic  business  business   logical education     Other     Total

Number of members designating
  principal interest                                 533       315       234       390        84       268     1,824

Percentage of members in broad group who
designated subsidiary interest in:

Economic:
  General economic research                         16.9      37.2      40.6      14.4       3.6      24.3      23.4
  General indexes of economic activity              28.5      29.3      32.9       6.7       1.2       9.7      20.5
  Price indexes and price analysis                  20.6      14.6      12.4       [?]         —      20.6      13.5
  Labor statistics                                   8.4       7.9       2.6      17.4       6.0       4.1       8.8
  Wages and incomes                                  9.5       8.9       3.4      15.6       2.4       4.6      12.1
  Taxes and public finance                           5.8       8.9       9.0       3.2       2.4       4.1       5.8

Non-financial business:
  Manufacturing production, employment, etc.        26.1      10.8      13.7       7.4         —       9.7      14.3
  Wholesale and retail trade                         8.1      10.5       1.7       1.3         —       2.6       5.0
  International trade                                5.6       3.5       5.1         —         —       5.2       3.7
  Market research and advertising                    8.3      17.2       1.3       1.0       4.8       9.3       7.3
  Business management                                7.9      16.5       9.0       1.8       3.6       3.4       7.3
  Real estate and construction                       4.9       2.2       [?]       1.0       1.2       1.9       2.6
  Public utilities, including transportation         3.8       3.2       5.6       0.5       1.2       2.6       2.9
  Accounting                                         5.6      15.6      14.1       4.1         —       2.1       7.3

Financial business:
  Investments and security markets                  11.4       9.2      19.7       1.0       1.2       3.4       8.2
  Banking and credit                                10.9       6.4      24.4         —         —       3.4       7.9
  Insurance                                          1.9       3.5       3.4       3.1       1.2       6.3       3.2

Sociological:
  Population research and vital statistics           6.0       8.6      11.5      26.9      14.3      19.0      13.9
  Public health research and administration          1.7       1.0       3.8      20.8       3.6       3.4       6.2
  Unemployment                                      16.3       1.0       2.6      23.1       7.2       6.7      11.5
  Sociological research                              2.3       1.6       1.7      27.4      25.0       7.5       9.3
  Relief and welfare services                        6.4       1.0       0.4      11.5       6.0       6.0       5.7

Psychology and education:
  Psychological research                             0.9       1.9       0.9       6.1      42.9       3.4       4.5
  Personnel and vocational guidance                  1.9       1.3       0.9       5.6      47.6       3.4       4.8
  Educational research                               1.3       1.6       1.3       2.1      41.7       3.0       3.6
  Educational administration                         0.4       1.6       0.4       0.5      19.1       1.1       1.6

Other:
  Statistics of agriculture and rural life           7.3       6.7       2.1       6.7       2.4       5.2       5.9
  Control of industrial processes                    2.6       5.7       0.4       0.5       2.4       4.9       2.7
  Agricultural experimentation                       0.4       1.0       0.4       0.5         —      10.8       2.0
  Biological research                                0.6       0.6       0.4       7.4      13.3       [?]       3.2
  Research in mathematical statistics                [?]       8.6       6.4       7.9      26.2      14.6       9.4
  Other                                              5.8       3.8       3.4       6.1         —       5.6       4.9

* Includes all members from whom reports were received except those designating a combination
of fields and those designating no field of principal interest.
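
Because the broad groups differ greatly in size, a figure in the "Total" column of Table III is not the simple mean of the six group percentages but an average weighted by the number of members in each group. A minimal sketch of that check follows, using the group sizes and two rows as they appear in the table. The function name is illustrative, and small discrepancies from rounding are to be expected.

```python
# Number of members in each broad group of principal interest (Table III).
GROUP_SIZES = {
    "Economic": 533,
    "Non-financial business": 315,
    "Financial business": 234,
    "Sociological": 390,
    "Psychology and education": 84,
    "Other": 268,
}

# Percentage of each group checking the subject as a subsidiary interest,
# listed in the same order as GROUP_SIZES.
LABOR_STATISTICS = [8.4, 7.9, 2.6, 17.4, 6.0, 4.1]
TAXES_AND_PUBLIC_FINANCE = [5.8, 8.9, 9.0, 3.2, 2.4, 4.1]


def weighted_total(group_percentages, group_sizes=GROUP_SIZES):
    """Average the group percentages, weighting each by its group's size."""
    sizes = list(group_sizes.values())
    weighted_sum = sum(p * n for p, n in zip(group_percentages, sizes))
    return weighted_sum / sum(sizes)


print(round(weighted_total(LABOR_STATISTICS), 1))          # table shows 8.8
print(round(weighted_total(TAXES_AND_PUBLIC_FINANCE), 1))  # table shows 5.8
```

The same check applied to other rows offers one way of judging whether an individual entry has been transcribed correctly.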


It is not surprising that the subsidiary interests of the members 
whose principal interests were in economics should fall largely in that 
field and in the various fields of business. The outstanding ancillary 
interests of these members were in general indexes of economic activity, 
manufacturing production, employment, etc., and price indexes and price 
analysis. The somewhat smaller proportion checking general economic
research as a minor interest is explained by the fact that so many members
designated that subject as their principal interest. Two fields of
financial business, investments and security markets and banking and 
credit, appeared to occupy an important place among the interests of 
the members in economics. 

Outside of the fields of economics and business, the subject of un- 
employment appears to have been the most important, while among the 
other sociological subjects the outstanding ones seemed to be population
research and relief. Statistics of agriculture and rural life and research
in mathematical statistics were each checked by about seven per cent. 
Only a few indicated an interest in subjects falling in the field of psy- 
chology and education. 

Non-financial business. The second group of subjects is composed of
eight branches of business operations. A total of 315 members indicated 
that their major field of interest was in one of these subjects. Of these, 
the largest numbers designated market research and advertising, with 
about 17 per cent; public utilities, including transportation, with 19 per 
cent; manufacturing production, employment, etc., with about 16 per 
cent; and business management, with 15 per cent. Smaller numbers 
underlined the other four subjects: wholesale and retail trade, interna- 
tional trade, real estate and construction, and accounting. 

The outstanding subsidiary interests of the members classified in 
non-financial business were general economic research and general indexes
of economic activity. Considerably less interest was reflected in
price indexes and price analysis, and still less in the other subjects in 
economics. In the non-financial business field, the outstanding sub- 
sidiary interest was shown in market research, business management, 
and accounting. The chief subject of ancillary interest among financial 
businesses was in investments and security markets. 

In the sociological field, considerable interest was shown in popula- 
tion research and vital statistics, most of which was concentrated among 
those whose major interest was in market research and advertising. 
There was only slight evidence of interest in the other sociological fields 
and in psychology and education. 

A sizeable proportion of the members in this field expressed an in- 
terest in research in mathematical statistics, control of industrial processes, 
and statistics of agriculture and rural life. The interest in mathematical 
statistics was especially prominent among those primarily interested in 
market research and in public utilities; most of the interest in control of 
industrial processes was concentrated among those whose major interest 
was in business management. 

Financial business. This group is composed of three subjects, the 
third of which, insurance, has a number of aspects in addition to the
financial one. There were 234 members whose major field of interest 
fell in this group. Of this number, nearly one-half designated investments 
and security markets as their chief interest; about 22 per cent reported 
banking and credit; the remaining 30 per cent indicated that insurance 
was their principal interest. 

As in the case of those in non-financial business, the most important
minor interests of the members in this field were general economic 
research and general indexes of economic activity. The bulk of this in- 
terest was concentrated among those whose primary interest was in 
investments and security markets and in banking and credit. Among the 
non-financial business subjects, the outstanding subsidiary interest was 
in accounting and in manufacturing production, employment, etc. 

Those whose major interests fell in the subjects of investments and 
security markets and in banking and credit reflected a large measure of 
ancillary interest in the other, but neither group exhibited any marked 
interest in the subject of insurance. Those whose primary interest was 
in insurance, however, showed considerable interest in investments and 
security markets, but only slight interest was reflected in banking and 
credit. The non-financial phases of the interest in insurance were reflected
in several sociological subjects, notably population research and
vital statistics. Nearly all of the subsidiary interest in this and other 
sociological subjects among those in financial business was reflected 
by those interested primarily in insurance. 

Sociological. Of the 390 members chiefly interested in sociological 
subjects, about 29 per cent designated population research and vital 
statistics as their primary interest, and a like proportion checked relief 
and welfare services. The remaining members were approximately evenly 
distributed among the other three subjects in the field: public health 
research and administration, unemployment, and sociological research. 

The subsidiary interests of the members in this group fell primarily 
in the sociological and certain of the economic subjects. More than a 
fourth designated sociological research as an ancillary interest, and a 
similar number checked population research. Slightly smaller propor- 
tions reported an interest in public health and in unemployment. Most 
of the subsidiary interest in unemployment was recorded by those 
primarily interested in relief and welfare services; most of the minor 
interest in public health, on the other hand, was reported by those whose 
major interest was in population research. The economic subjects of 
most interest to those in this group were labor statistics, wages and 
incomes, and general economic research. It is not surprising that the 
bulk of the interest in the first two of these subjects should have been 
reported by those chiefly interested in unemployment and in relief and
welfare services. The members checking an interest in general economic 
research were scattered over all of the sociological groups. 

Among the other fields, the principal ancillary interests fell in biological
research and in mathematical statistics, although considerable
interest was also expressed in statistics of agriculture, psychological research,
and personnel and vocational guidance. As is to be expected,
relatively little interest was expressed in most of the business subjects 
or in the control of industrial processes. 

Psychology and education. Of the 84 members who reported that their 
major interest in statistics fell in one of the four subjects classified in 
this field, a little less than 40 per cent designated psychological research, 
a similar number underlined educational research, and nearly all of the 
remainder reported personnel and vocational guidance as their chief 
interest. 

The subsidiary interests of the members in this field were for the 
most part concentrated among subjects falling in the field. There was a
large degree of mutual interest among psychological research, educational
research, and personnel and vocational guidance. Considerably less
ancillary interest was expressed in educational administration. 

The other subjects of interest to the members in this field fell pri- 
marily in the sociological field, notably sociological research and popula- 
tion research, and in mathematical statistics. In the economic and busi- 
ness fields, moderate interest was expressed in wages and incomes, ac- 
counted for entirely by those whose chief interest was in personnel 
guidance. Some interest also was shown in market research and adver- 
tising. 

Other. The remaining 268 members who furnished information about 
their major interest in statistics and who designated a specific field are
classified in this general group. Aside from those who indicated some 
subject not named specifically on the questionnaire, most of those in 
the group reported that their major interest was in statistics of agricul- 
ture and rural life or in mathematical statistics. 

The 77 members whose primary interest was in statistics of agriculture 
indicated that their most important ancillary interests were in price 
indexes and price analysis and in general economic research, although 
considerable interest was also shown in general indexes of economic 
activity and in market research. Twelve of the members expressed an 
interest in agricultural experimentation, and ten checked population 
research. 

The members whose major interest was in research in mathematical 
statistics had rather widely scattered subsidiary interests. Some interest 
was reflected in each of the economic subjects, but particularly in
general economic research and in price indexes. Little interest was re- 
ported in business subjects with the exception of insurance, which was 
checked by nine members. Among the sociological subjects the most 
important field was population research and vital statistics, reported as 
an ancillary interest by 13 members. Seven members checked psychological
research as a subsidiary interest, and the same number expressed
interest in agricultural experimentation.

The number whose chief interest lay in each of the other three specific 
subjects classified in the group was too small to provide a very reliable 
basis for measuring their subsidiary interests, but the evidence avail- 
able reflects the combinations that might be expected. For example, 
the members who checked biological research as their primary interest 
were particularly interested in mathematical statistics, agricultural ex- 
perimentation, and population research. Those whose chief interest was in 
agricultural experimentation expressed the most subsidiary interest in 
mathematical statistics, biological research, and statistics of agriculture. 
The nine members designating control of industrial processes as their 
chief interest were also interested in manufacturing production, business 
management, and mathematical statistics. 


GEOGRAPHIC DIFFERENCES WITH RESPECT TO 
CONNECTIONS OF MEMBERS 


The chapters of the Association differed to a marked degree with respect
to the proportions of their members8 connected with various types
of organizations. This is to be expected, because some chapters are 
located in state capitals, others in localities that are both state capitals 
and sites of large universities, and some are in large business or indus- 
trial centers. The Chapter in Washington, D. C., is composed primarily 
of employees of the federal government. The data in Table IV sum- 
marize the percentage distributions in the chapter areas. 

Members connected with colleges and universities were relatively 
largest in number in the Chapters in Madison, Detroit,9 Austin, San
Francisco, and Boston. They were relatively smallest in number in 
Albany, Harrisburg, Washington, and New York, the first three being 
capitals and the last an important business center. The proportions 
connected with state government agencies were naturally largest in the 


8 That is, the members of chapters who were also members of the national Association. Some chap- 
ters have associate members not members of the parent organization; such members are not included in 
this tabulation. 

9 The Detroit Chapter includes not only the members in Detroit but also those in Toledo and in
Ann Arbor. The location of the University of Michigan in the latter city undoubtedly explains in part 
the relative importance of college and university connections in the chapter area. 
TABLE IV

MEMBERSHIP OF THE AMERICAN STATISTICAL ASSOCIATION, BY LOCALITY AND BY TYPE OF
ORGANIZATION WITH WHICH CONNECTED*

                                                   Percentage of members in locality connected with

                         Number   College     Local     State   Federal Financial    Public  Manufac- Retail or     Other
                             of  or univ.     govt.     govt.     govt.  inst. or   utility    turing wholesale  business
Locality                members              agency    agency    agency    agency   company  industry     trade and prof.   Total

Albany Chapter               42       9.5         —      69.0       4.8         —       2.4       9.5         —       4.8   100.0
Austin Chapter               16      50.0         —      25.0      25.0         —         —         —         —         —   100.0
Boston Chapter               79      43.1       1.3       3.8       5.1      22.8       2.5       2.5       1.3      17.7   100.0
Chicago Chapter             107      30.8       2.8       8.4       7.5      11.2       6.5       9.4       6.5      16.8   100.0
Cincinnati Chapter           17      23.6       5.9         —       5.9       5.9         —      41.2       5.9      11.8   100.0
Cleveland Chapter            36      22.2         —         —       5.6      16.7       8.3      25.0       5.6      16.7   100.0
Columbus Chapter             39      38.4         —      23.1      17.9       2.6         —       7.7         —      10.3   100.0
Connecticut Chapter          38      44.7         —      10.5         —      26.3         —       5.3       2.6      10.5   100.0
Detroit Chapter              52      52.0       3.8       9.6       1.9       3.8       3.8       7.7       1.9      15.4   100.0
Harrisburg Chapter           17      11.8       5.9      76.4       5.9         —         —         —         —         —   100.0
Lehigh Valley Chapter         5      40.0         —         —         —      40.0      20.0         —         —         —   100.0
Madison Chapter              19      52.6         —      21.0      10.5         —         —       5.3         —      10.5   100.0
New York Chapter            564      16.7       2.8       3.9       3.9      23.2       8.0       6.7       2.8      31.9   100.0
Philadelphia Chapter         53      34.0       1.9       3.8      11.3      18.9       5.7       5.7         —      18.9   100.0
Pittsburgh Chapter           22      36.4       4.5         —       4.5       4.5       4.5      18.2         —      27.3   100.0
San Francisco Chapter        69      45.0       1.4      18.8       8.7      13.0       2.9       1.4         —       8.7   100.0
Washington Chapter          565       2.7       0.2       0.2      89.9       0.5       0.5       0.2         —       5.8   100.0

Other areas                 408      43.9       2.0      10.0      14.4       5.4       3.9       5.9       1.0      13.5   100.0

Total                     2,148      23.7       1.7       7.4      29.5      10.6       4.0       5.3       1.5      16.3   100.0

* Includes all members from whom reports were received except those who had retired.


state capitals situated in Albany and in Harrisburg where there are 
comparatively few other activities. Austin, Columbus, and Madison 
had important but relatively somewhat smaller proportions connected 
with state government agencies because, in addition to being state 
capitals, they are the sites of large universities. About 90 per cent of the 
members in the Washington Chapter were connected with federal 
government agencies; about 80 per cent of the members in the employ 
of the federal government and its agencies were located in that chapter 
area. 
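
The two percentages in the last sentence describe the same group of members from different bases. As a rough sketch, assuming that the 634 federal employees of Table I and the 565 reporting members of the Washington Chapter in Table IV are drawn from the same returns (the variable names are illustrative):

```python
washington_members = 565        # reporting members, Washington Chapter (Table IV)
washington_federal_pct = 89.9   # per cent of them in federal agencies (Table IV)
all_federal_members = 634       # federal employees, all reporting members (Table I)

# Approximate head count, recovered from the rounded percentage.
washington_federal = round(washington_members * washington_federal_pct / 100)

# Share of all federally employed members who were in the Washington Chapter.
share_in_washington = round(100 * washington_federal / all_federal_members, 1)

print(washington_federal, share_in_washington)
```

The first figure answers how many of the Washington members worked for the federal government; the second answers how many of the federally employed members were in Washington.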

The Connecticut, New York, and Boston Chapters apparently had 
larger proportions of their members connected with financial institutions
and agencies than any other chapter,10 although the Cleveland,
Philadelphia, and San Francisco Chapters also had important segments 


10 Excluding the Lehigh Valley Chapter, from only five of whose members reports were received. 
Owing to the small number of members reporting, that Chapter is not included in this or subsequent 
comparisons, though it is included in the tables. 
of their members in that field of work.11 The Chapters in Cleveland and
New York showed the largest proportions employed by public utility 
companies, while members connected with manufacturing concerns 
were relatively the most important in the Cincinnati, Cleveland, and 
Pittsburgh Chapters. The Chicago Chapter had a larger proportion of 
its members in retail and wholesale trade concerns than any other 
chapter, although the Cincinnati and Cleveland Chapters were not far 
behind in this respect. 

A number of the chapters, notably those in New York and in Pitts- 
burgh, had important segments of their members in other businesses 
and professions not named specifically on the directory form. No 
analysis of the types of connections involved has been made, although 
social welfare agencies, accounting firms, and various business service 
concerns accounted for a large part of this group. The proportions in 
this class composed of the different types of organizations included 
undoubtedly varied among the chapters.12


GEOGRAPHIC DIFFERENCES WITH RESPECT TO PRINCIPAL 
INTERESTS OF MEMBERS IN STATISTICS 


To a large extent, no doubt, there was an interrelation between the 
connections of the members of the Association and their principal 
interests in statistical data and methods. It is not surprising, therefore, 
that the differences between the chapters with regard to the principal 
interests of their members, traced below and presented in Table V,
paralleled to some extent those discussed above relative to the members’
connections. 

The Philadelphia, Madison, and Cincinnati Chapters had larger 
proportions of their members primarily interested in economic subjects 
than any of the other chapters, and the Austin Chapter had the small- 


11 The importance of financial institutions and agencies in the Connecticut Chapter is explained by 
the fact that insurance companies, a number of whose home offices are located in Hartford, were classi- 
fied as financial institutions. 

12 Parenthetically, it is interesting to note the change that has occurred in the geographic distribu- 
tion of the Association’s membership over the years. The archives of the Association include a table 
showing the distribution in 1893, at which time there were 504 members. The note by Professor King, 
already referred to, showed the distribution in 1926. The following brief comparison for selected areas 
summarizes the marked shift taking place over the last half-century: 





Area                               1893      1926      1940

Number of members                   504       629     2,644

Percentage of members in:
  New England                      36.4      11.9       6.4
  Middle Atlantic                  28.4      45.8      35.6
  South Atlantic                   10.5      13.4      29.3
  East North Central               12.7      18.6      13.3
  Other areas                      12.0      10.3      15.4
TABLE V

MEMBERSHIP OF THE AMERICAN STATISTICAL ASSOCIATION, BY LOCALITY AND
BY BROAD GROUP OF PRINCIPAL INTEREST IN STATISTICAL DATA
AND METHODS*

                                  Percentage of members in locality who designated their
                                  principal interest in statistical data and methods in

                         Number                Non-                        Psych.
                             of           financial Financial                 and
Locality                members Economics  business  business Sociology education     Other   Total

Albany Chapter               38      18.4      18.4       5.3      42.1       2.6      13.2   100.0
Austin Chapter               11       9.1      27.3         —      27.3       9.1      27.3   100.0
Boston Chapter               61      36.1      11.5      21.3      14.7       8.2       8.2   100.0
Chicago Chapter              92      32.6      20.7       9.8      20.7       4.3      11.9   100.0
Cincinnati Chapter           15      40.0       6.7      13.3      26.7      13.3         —   100.0
Cleveland Chapter            31      32.3      35.4       9.7       6.5       3.2      12.9   100.0
Columbus Chapter             33      18.2       9.1       3.0      42.4      12.1      15.2   100.0
Connecticut Chapter          29      17.2      10.3      34.6      20.7      10.3       6.9   100.0
Detroit Chapter              42      26.1      11.9      14.3      14.3      16.7      16.7   100.0
Harrisburg Chapter           16      37.5      12.5         —      43.8       6.2         —   100.0
Lehigh Valley Chapter         4      25.0      25.0      25.0         —         —      25.0   100.0
Madison Chapter              17      41.2      11.8         —      11.8         —      35.2   100.0
New York Chapter            477      29.4      21.0      22.2      14.0       4.0       9.4   100.0
Philadelphia Chapter         45      44.5      17.8      20.0      15.5         —       2.2   100.0
Pittsburgh Chapter           17      23.5      47.0       5.9      17.7       5.9         —   100.0
San Francisco Chapter        60      35.0       6.7      13.3      21.7       3.3      20.0   100.0
Washington Chapter          490      30.6      17.3       5.7      25.3       2.9      18.2   100.0

Other areas                 346      24.9      13.3      10.1      25.4       5.5      20.8   100.0

Total                     1,824      29.2      17.3      12.8      21.4       4.6      14.7   100.0

* Includes all members from whom reports were received except those designating a combination
of fields and those designating no field of principal interest.


est proportion. The latter, however, was among those having the 
largest relative number chiefly interested in the non-financial business 
field, although the proportion was exceeded by that in two other chap- 
ter areas: namely, the Pittsburgh and Cleveland Chapters. The Con- 
necticut Chapter had the largest proportion of its members primarily 
interested in financial business subjects, followed by the New York and 
Boston Chapters. 

Turning to the sociological field, the Chapters in the three state 
capitals of Albany, Harrisburg, and Columbus reported the largest 
relative numbers chiefly interested. The Columbus Chapter also had a 
substantial relative number of members whose principal interest was 
in psychology and education, although the Detroit and Cincinnati 
Chapters exhibited slightly larger proportions in this field. The Chap- 
ters in Austin and Madison had larger proportions of their members 
primarily interested in subjects in the category “other” than any of the 
other chapters. 
TABLE VI

MEMBERSHIP OF THE AMERICAN STATISTICAL ASSOCIATION, BY RELATION TO
STATISTICAL OPERATIONS AND BY STATISTICAL TECHNIQUES AND METHODS
OF GREATEST INTEREST*

                                               Regarded as of principal        Regarded as of importance
                                               importance by                   but to a lesser degree by
                                                          Consumers,                      Consumers,
                                               Producers  users, and           Producers  users, and
Statistical techniques and methods             of statis- teachers of          of statis- teachers of
                                               tics       statistics  Total†   tics       statistics  Total†

Number of members                                  608      1,131     2,168       608       1,131     2,168

Percentage of members in group regarding
the technique or method as of importance:
  Preparation of schedules, questionnaires,
    and record forms                              40.5       15.4      23.4      27.2        20.7      22.6
  Field survey technique                          19.4       13.0      14.5      17.6        13.7      15.7
  Systems of current reporting                    30.4        8.7      15.0      15.1        10.6      12.3
  Accounting methods and cost analysis             7.6        9.2       8.6      11.3         9.7      11.3
  Analysis of financial statements and
    investment analysis; ratio methods             7.4       13.5      11.0      10.4        13.2      12.8
  Index number construction                       11.2       10.2       9.8      16.4        18.7      17.8
  Business cycle analysis                          8.1       24.0      17.5      13.6        19.2      17.9
  Classification methods                          11.7        6.1       7.9      20.0        12.0      14.0
  Non-mechanical tabulation methods               13.3        5.3       7.4      18.1        12.1      14.2
  Mechanical tabulation methods                   18.7        7.2      10.1      26.3        15.6      18.8
  Computing averages, percentages, and
    rates                                         23.2       16.7      17.9      31.2        24.5      26.0
  Charting and graphic presentation               19.2       24.8      21.7      46.0        34.5      35.2
  Curve fitting and graduation                     3.1       10.5       7.6      12.2        18.6      16.3
  Actuarial methods                                3.1        6.6       5.3       5.8         6.2       6.1
  Measures of variation and correlation            6.9       30.5      17.3      20.0        22.1      21.4
  Analysis of variance and experimental
    design                                         2.1       13.9       8.2       6.1        11.6       9.6
  Probability and tests of significance            2         25.4      15.9      15.6        18.9      18.2
  Other                                            2.6        3.7       2.8       3.1         3.2       3.7

* Includes all members from whom reports were received who furnished this information.
† In addition to the groups shown, the total includes combinations and those who did not designate
the group in which they felt they belonged.


IMPORTANCE OF STATISTICAL TECHNIQUES AND METHODS 


Forty per cent of the members of the Association who were primarily 
producers of statistics reported that the preparation of schedules, 
questionnaires, and record forms was one of the techniques of principal 
importance to them. Among the members who were chiefly consumers, 
users, or teachers of statistics, 30 per cent reported that measures of 
variation and correlation were included among the most important 
methods. 

These are two of the interesting and revealing conclusions to be 
drawn from the data in Table VI, based on the replies to the following
question on the directory form: 


What techniques and methods do you use or find of greatest 
interest? (Underline those which are of principal importance to 
you; check those you find important but to a lesser degree.) 


The question was followed by a list of 17 specific techniques or methods 
and the traditional “other. ...” The list is shown in the table, the 
data in the left-hand part of which represent the percentages of the 
members in each column who underlined each technique as being of 
principal importance; those in the right-hand part are the corresponding
percentages who checked the techniques as of importance but to a
lesser degree. 

It is not surprising to find that the methods of principal importance 
to producers of statistics were those related to the collection and tabu- 
lation of data. The computing of averages, percentages, and rates also 
appears to have been important. In general, techniques involving the 
analysis of statistical data after they have been tabulated were of con- 
siderably less importance to producers. For example, only about two 
per cent of the group regarded the analysis of variance and experimental 
design of principal importance. 

The consumers, users, and teachers of statistics selected a different 
set of techniques. As already noted, measures of variation and cor- 
relation played an important role among those members. Others of 
principal importance were the analysis of variance and experimental 
design, charting and graphic presentation, and index number con- 
struction. 

Of importance to a lesser degree to both groups, the method checked 
most frequently was charting and graphic presentation. The technique 
checked the next most frequently by both groups was the computing of 
averages, percentages, and rates. Some variation between the two 
groups of members appeared in relation to the other methods, although 
the preparation of schedules, questionnaires, and record forms and the 
measures of variation and correlation were checked by substantial 
percentages of both groups of members. 

Taking the membership as a whole, actuarial methods appear to 
have been of importance to the smallest number of members. However, 
owing to the specialized field of application and the technical character 
of the methods, this limited interest is to be expected. Charting and 
graphic presentation seems to have been of importance to the largest 
proportion of the members, about 57 per cent having indicated a greater 
or less interest in this phase of statistical methods. 
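
The figure of about 57 per cent can be checked against the "Total" columns of Table VI. The short sketch below adds the "principal" and "lesser" percentages for three of the techniques; it assumes, as the wording of the question suggests but the text does not state outright, that no member both underlined and checked the same technique, so that the two percentages may simply be added.

```python
# Percentages from the two "Total" columns of Table VI.
# Assumption: underlining and checking a technique were mutually exclusive,
# so the two percentages for a technique can be summed.
principal = {
    "Charting and graphic presentation": 21.7,
    "Computing averages, percentages, and rates": 17.9,
    "Actuarial methods": 5.3,
}
lesser = {
    "Charting and graphic presentation": 35.2,
    "Computing averages, percentages, and rates": 26.0,
    "Actuarial methods": 6.1,
}

# Share of members reporting the technique as of importance in either degree.
combined = {name: round(principal[name] + lesser[name], 1) for name in principal}

for name, pct in combined.items():
    print(f"{name}: {pct} per cent")
```

On these figures charting and graphic presentation comes to 56.9 per cent and actuarial methods to 11.4 per cent, in agreement with the statements in the text.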

















THE DIFFERENCE BETWEEN THE PAASCHE AND 
LASPEYRES INDEX-NUMBER FORMULAS 


By IRVING H. SIEGEL


PERSISTENT interest has been shown in the conditions which determine
the sign and magnitude of the divergence between index numbers
computed for the same time according to different formulas. The
nature of the difference between the Paasche and Laspeyres indexes is 
of particular interest inasmuch as these two formulas are among the 
most frequently used of the simpler acceptable varieties, they represent 
limits between which the “ideal” and Edgeworth formulas lie, and they 
may also be regarded as components of these important “compromise” 
measures. Moreover, the two indexes, when considered in conjunction, 
have special significance in certain practical and theoretical problems 
(e.g., the determination of the “true” index of cost of living). 

In this paper, we shall present a number of expressions for the dif- 
ference between the Paasche and Laspeyres formulas which may be of 
analytical value in various investigations and also be useful in the 
study of bias. But first, we shall define the two measures. Although, for 
convenience, we restrict our discussion to temporal comparisons, the 
results are equally applicable to spatial comparisons (e.g., relative 
living costs in different regions during the same period). Let
$g_1, g_2, \ldots, g_n$ be a set of values (e.g., prices) associated with
$n$ items (e.g., commodities) in the base period $t_0$; and let
$h_1, h_2, \ldots, h_n$ be the corresponding values of a second set (e.g.,
quantities) for the base period. Similarly, let $g_1', g_2', \ldots, g_n'$
and $h_1', h_2', \ldots, h_n'$ be the two series of values for the same $n$
items at any other time, say $t_1$.¹ Now, the Paasche index of the $g_i'$
on the base $t_0$ is:

$$P_G = \frac{\sum g'h'}{\sum gh'}
      = \frac{\sum gh'\,\dfrac{g'}{g}}{\sum gh'}
      = \frac{\sum gh\,\dfrac{g'}{g}\,\dfrac{h'}{h}}{\sum gh\,\dfrac{h'}{h}}
      = \frac{\sum mXY}{\sum mY}, \tag{1}$$

where $m_i = g_ih_i$ (base-year money value of the $i$th commodity if $g_i$
represents price and $h_i$ represents quantity); $X_i = g_i'/g_i$ (price
relative); and $Y_i = h_i'/h_i$ (quantity relative). In (1) and in all
subsequent instances where subscripts are omitted, the summation will be
understood to extend through the series $i = 1, 2, \ldots, n$. The
Laspeyres index of the $g_i'$ on the base $t_0$ is:

$$L_G = \frac{\sum g'h}{\sum gh}
      = \frac{\sum gh\,\dfrac{g'}{g}}{\sum gh}
      = \frac{\sum mX}{\sum m}. \tag{2}$$

The analogous indexes for the $h_i'$ will be designated $P_H$ and $L_H$.

1 It should be noted that a “rise” in an index between $t_0$ and $t_1$ may, from a strict chronological
viewpoint, represent a fall if $t_1$ actually precedes $t_0$. In other words, a base year is not necessarily the
“first” year of a time series, and the changes indicated by an index should be verbalized accordingly.
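
The two definitions may be illustrated with a minimal sketch. The three-commodity data are invented for illustration and do not come from the paper; the function names `paasche` and `laspeyres` are likewise only labels chosen here. The sketch computes each index in its aggregative form and again in the mean-of-relatives form of (1) and (2).

```python
# Illustrative data only (not from the paper): three commodities.
g0 = [1.0, 2.0, 4.0]    # base-period prices g_i
h0 = [10.0, 5.0, 2.0]   # base-period quantities h_i
g1 = [1.5, 2.0, 3.0]    # given-period prices g_i'
h1 = [8.0, 6.0, 4.0]    # given-period quantities h_i'


def paasche(g0, g1, h1):
    """Aggregative Paasche index: sum(g'h') / sum(g h')."""
    return sum(a * q for a, q in zip(g1, h1)) / sum(b * q for b, q in zip(g0, h1))


def laspeyres(g0, g1, h0):
    """Aggregative Laspeyres index: sum(g'h) / sum(g h)."""
    return sum(a * q for a, q in zip(g1, h0)) / sum(b * q for b, q in zip(g0, h0))


P_G = paasche(g0, g1, h1)
L_G = laspeyres(g0, g1, h0)

# Mean-of-relatives forms: weights m = g*h, relatives X = g'/g and Y = h'/h.
m = [p * q for p, q in zip(g0, h0)]
X = [a / b for a, b in zip(g1, g0)]
Y = [a / b for a, b in zip(h1, h0)]
P_G_rel = sum(mi * xi * yi for mi, xi, yi in zip(m, X, Y)) / sum(
    mi * yi for mi, yi in zip(m, Y)
)
L_G_rel = sum(mi * xi for mi, xi in zip(m, X)) / sum(m)

print(P_G, L_G, P_G - L_G)
```

For these data the Paasche index is 36/36 and the Laspeyres index 31/28, so that the difference is negative.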


EXPRESSIONS INVOLVING WEIGHTED CORRELATION 


In Making of Index Numbers, Fisher asserted² that the Paasche price
index is higher or lower than the Laspeyres price index according as
“the price relatives are positively or negatively correlated with the
quantity relatives”; and that a higher correlation coefficient “almost
always” signifies a wider gap between the two indexes.³ L. von
Bortkiewicz, however, showed that the simple coefficient of correlation
between the price and quantity relatives is an equivocal criterion; and
that the sign and magnitude of the difference may be given exactly by
an expression involving the weighted coefficient of correlation.⁴ Indeed,
whatever the sign of the difference, it is possible for the simple
coefficient to assume a zero, positive, or negative value.⁵ The sign of the
weighted coefficient, on the other hand, is always that of the difference;
and, if the two indexes are equal, the value of this coefficient is zero.

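We now proceed to derive von Bortkiewicz’s expression. The difference
between the Paasche and Laspeyres indexes for the $g_i'$ may be
written:

\begin{align}
\Delta_G = P_G - L_G &= \frac{\sum mXY}{\sum mY} - \frac{\sum mX}{\sum m} \notag \\
&= \frac{1}{\sum mY}\left(\sum mXY - \frac{\sum mX \cdot \sum mY}{\sum m}\right) \tag{3} \\
&= \frac{1}{\sum mY}\sum\left[m(X - L_G)(Y - L_H)\right]. \tag{4}
\end{align}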

2 Dr. Fisher writes: 

“Siegel does not show that there is anything incorrect in the statement quoted from me. On the 
contrary he shows, with von Bortkiewicz, that my cautious phrase ‘almost always’ may be changed to an 
unqualified ‘always’ if the correlation coefficient is properly weighted.” ED. 

3 I. Fisher, Making of Index Numbers, 1922, pp. 410-12.

4 For von Bortkiewicz's criticism of Fisher's use of the simple coefficient of correlation, see “Zweck
und Struktur Einer Preisindexzahl” (Pt. 1), Nordisk Statistisk Tidskrift, 1923, pp. 395 ff. In this article
and later in “Die Kaufkraft des Geldes und Ihre Messung,” Nordic Statistical Journal, 1932, pp. 1-78,
he demonstrated the applicability of the weighted correlation coefficient to various index-number
problems. H. Staehle followed von Bortkiewicz in the use of weighted correlation in “International
Comparison of Food Costs,” International Comparisons of Cost of Living (I.L.O. Studies and Reports: Ser. N,
No. 20, 1934), pp. 1-105.

5 A general proof of the inconclusiveness of the sign of the simple coefficient as a test of the sign of
the difference may readily be devised. In a note in this JOURNAL, 1936, pp. 726-28, G. H. Evans, Jr.,
demonstrated that in the special case where the “spread” of the $m_i$ is “not great,” the simple correlation
coefficient and the difference $\Delta_G$ may both be negative.
The parenthetic factor in (3) and the sum of the bracketed terms in (4)
are both in the form of the numerator of the weighted coefficient of
correlation:

$$r_{m:X\cdot Y} = \frac{\sum mxy}{\sum m\,\sigma_{m:X}\,\sigma_{m:Y}}, \tag{5}$$

where $x_i = X_i - L_G$ and $y_i = Y_i - L_H$ represent deviations from means
with base-period weights; and $\sigma_{m:X} = (\sum mx^2/\sum m)^{1/2}$ and
$\sigma_{m:Y} = (\sum my^2/\sum m)^{1/2}$ represent weighted standard
deviations. From (4) and (5), we obtain von Bortkiewicz’s result:

$$\Delta_G = r_{m:X\cdot Y}\,\sigma_{m:X}\,\frac{\sigma_{m:Y}}{L_H}. \tag{6}$$


Thus, we see that the difference may be expressed exactly as the product
of: the weighted coefficient of correlation between the two sets of
relatives; the weighted standard deviation of the relatives being averaged
($X_i$); and the weighted coefficient of variation of the second set of
relatives ($Y_i$). Since the latter two factors are always positive, the sign
of the difference, as has already been stated, is given unequivocally by the
sign of the weighted correlation coefficient; and this coefficient vanishes
when the difference vanishes. Incidentally, the sign of the difference
$\Delta_H = P_H - L_H$ is also given by the correlation coefficient in (6); the other
factors in this case, however, are $\sigma_{m:X}/L_G$ and $\sigma_{m:Y}$.
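
The result (6) may be checked numerically. The sketch below uses invented data (not from the paper) and hypothetical helper names `wmean`, `wcov` and `wcorr` for the base-weighted mean, covariance and correlation; it compares the direct difference $P_G - L_G$ with the product on the right of (6).

```python
import math


def wmean(w, a):
    """Weighted arithmetic mean of a with weights w."""
    return sum(wi * ai for wi, ai in zip(w, a)) / sum(w)


def wcov(w, a, b):
    """Weighted covariance of a and b about their weighted means."""
    ma, mb = wmean(w, a), wmean(w, b)
    return sum(wi * (ai - ma) * (bi - mb) for wi, ai, bi in zip(w, a, b)) / sum(w)


def wcorr(w, a, b):
    """Weighted coefficient of correlation, as in (5)."""
    return wcov(w, a, b) / math.sqrt(wcov(w, a, a) * wcov(w, b, b))


# Illustrative data only: prices and quantities for three commodities.
g0, h0 = [1.0, 2.0, 4.0], [10.0, 5.0, 2.0]
g1, h1 = [1.5, 2.0, 3.0], [8.0, 6.0, 4.0]

m = [p * q for p, q in zip(g0, h0)]          # base-period values m_i
X = [a / b for a, b in zip(g1, g0)]          # price relatives
Y = [a / b for a, b in zip(h1, h0)]          # quantity relatives
mY = [mi * yi for mi, yi in zip(m, Y)]       # Paasche weights m_i Y_i

L_G, L_H = wmean(m, X), wmean(m, Y)
P_G = wmean(mY, X)
delta_direct = P_G - L_G

r = wcorr(m, X, Y)
sd_X = math.sqrt(wcov(m, X, X))
sd_Y = math.sqrt(wcov(m, Y, Y))
delta_eq6 = r * sd_X * sd_Y / L_H            # right-hand side of (6)

print(delta_direct, delta_eq6, r)
```

In this example the weighted correlation is negative and the Paasche index accordingly falls below the Laspeyres index.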

We may also express the difference in terms of other weighted
correlation coefficients. For example, following Staehle,⁶ we may show
that (6) is equivalent to:

$$\Delta_G = r_{m':1/X\cdot 1/Y}\,\sigma_{m':1/X}\,\sigma_{m':1/Y}\,M P_G, \tag{7}$$

where $m_i' = g_i'h_i' = m_iX_iY_i$; and $M = L_GP_H = \sum m'/\sum m$. The
$m'$-weighted indexes of $1/X_i$ and $1/Y_i$, which enter into the correlation
coefficient and the standard deviations, turn out to be the reciprocals
of the Paasche indexes $P_G$ and $P_H$. In (7) as in (6), the sign of the
correlation coefficient is in perfect accord with the sign of the difference.

A third expression is probably of greater interest and applicability
than the preceding variant; it involves the weighted coefficient of
correlation between the $X_i$ and $1/Y_i$. Such a formulation is useful, for




In view of what has been said, it is clear that statements like the following from J. R. Hicks, 
“Valuation of the Social Income,” Economica, 1940, p. 113 (footnote), are not necessarily correct: “If 
P [Paasche price index] > L [Laspeyres price index] there is a positive correlation between movements
of relative prices and movements of quantities acquired, not (as we should expect with constant wants) 
a negative correlation.” 

6 H. Staehle, op. cit., pp. 15-16 (footnote). Incidentally, a misprint occurs in the expression for the
correlation coefficient on p. 16: Pe should read P,. 
example, in the case where the $X_i$ represent quantity relatives; the
$1/Y_i$, productivity relatives (i.e., the reciprocals of relatives of labor
required per unit of output); and $P_G$ and $L_G$, indexes of physical output
with unit-labor-requirement weights ($h_i'$ and $h_i$, respectively) if the
form is aggregative.⁷ In this case, the correlation coefficient in the
expression for the difference involves the production and productivity
relatives and the weights ($m_iY_i$) entering into the mean-of-relatives
form of $P_G$:

$$\Delta_G = -\,r_{mY:X\cdot 1/Y}\,\sigma_{mY:X}\,\sigma_{mY:1/Y}\,L_H. \tag{8}$$

The $m_iY_i$-weighted indexes entering into the numerator of the
correlation coefficient and the standard deviations are $P_G$ and the
productivity index $1/L_H$. The sign of the correlation coefficient in (8) is
always opposite to the sign of the coefficients in (6) and (7), unless, of
course, all three are zero.
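
The variants (7) and (8) may be verified in the same manner. In the sketch below the data are invented and the helpers `wmean`, `wcov` and `wcorr` are hypothetical names; the $m'$-weighted statistics of the reciprocals $1/X_i$ and $1/Y_i$ are used for (7), and the $m_iY_i$-weighted statistics of $X_i$ and $1/Y_i$ for (8).

```python
import math


def wmean(w, a):
    return sum(wi * ai for wi, ai in zip(w, a)) / sum(w)


def wcov(w, a, b):
    ma, mb = wmean(w, a), wmean(w, b)
    return sum(wi * (ai - ma) * (bi - mb) for wi, ai, bi in zip(w, a, b)) / sum(w)


def wcorr(w, a, b):
    return wcov(w, a, b) / math.sqrt(wcov(w, a, a) * wcov(w, b, b))


def wsd(w, a):
    return math.sqrt(wcov(w, a, a))


# Illustrative data only (same hypothetical commodities as before).
m = [10.0, 10.0, 8.0]          # m_i = g_i h_i
X = [1.5, 1.0, 0.75]           # X_i = g_i'/g_i
Y = [0.8, 1.2, 2.0]            # Y_i = h_i'/h_i

mY = [mi * yi for mi, yi in zip(m, Y)]             # weights m_i Y_i = g_i h_i'
m1 = [mi * xi * yi for mi, xi, yi in zip(m, X, Y)]  # weights m_i' = g_i' h_i'
inv_X = [1.0 / x for x in X]
inv_Y = [1.0 / y for y in Y]

L_G, L_H = wmean(m, X), wmean(m, Y)
P_G = wmean(mY, X)
M = sum(m1) / sum(m)                                # value index
delta_direct = P_G - L_G

# Expression (7): m'-weighted correlation of 1/X and 1/Y.
r7 = wcorr(m1, inv_X, inv_Y)
delta_eq7 = r7 * wsd(m1, inv_X) * wsd(m1, inv_Y) * M * P_G

# Expression (8): mY-weighted correlation of X and 1/Y, with reversed sign.
r8 = wcorr(mY, X, inv_Y)
delta_eq8 = -r8 * wsd(mY, X) * wsd(mY, inv_Y) * L_H

print(delta_direct, delta_eq7, delta_eq8)
```

The coefficient in (8) comes out positive here while that in (7) is negative, which accords with the statement that the two are always opposite in sign.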

It should be noted that the weights involved in the correlation
coefficients and standard deviations of (6), (7), (8) are in all cases the
products of the denominators of the correlated relatives. Thus, the
weights in (6), $m_i = g_ih_i$, are products of the denominators of
$X_i = g_i'/g_i$ and $Y_i = h_i'/h_i$; those in (7), $m_i' = g_i'h_i'$, are
products of the denominators of $1/X_i = g_i/g_i'$ and $1/Y_i = h_i/h_i'$; and
those in (8), $m_iY_i = g_ih_i'$, are products of the denominators of
$g_i'/g_i$ and $h_i/h_i'$.

Finally, we may transform the above expressions by making use of
identities satisfied by the weighted correlation coefficient which are
analogous to those satisfied by the simple correlation coefficient. The
following relations may readily be verified:

\begin{align}
r_{m:X\cdot Y} &= \frac{\sigma_{m:X}^2 + \sigma_{m:Y}^2 - \sigma_{m:(X-Y)}^2}{2\,\sigma_{m:X}\,\sigma_{m:Y}} \tag{9} \\
&= 1 - \frac{\sigma_{m:(x'-y')}^2}{2} \tag{10} \\
&= b_{m:Y\cdot X}\,\frac{\sigma_{m:X}}{\sigma_{m:Y}} \tag{11} \\
&= b_{m:X\cdot Y}\,\frac{\sigma_{m:Y}}{\sigma_{m:X}}. \tag{12}
\end{align}

In (9), $\sigma_{m:(X-Y)}^2$ is the weighted variance of $X_i - Y_i$; in (10),
$x_i' = x_i/\sigma_{m:X} = (X_i - \sum mX/\sum m)/\sigma_{m:X}$ and
$y_i' = y_i/\sigma_{m:Y} = (Y_i - \sum mY/\sum m)/\sigma_{m:Y}$; and in (11) and
(12), $b_{m:Y\cdot X}$ and $b_{m:X\cdot Y}$ represent weighted coefficients of
regression. Although we have used the symbols $m$, $X$, $Y$,
etc., the identities (9)-(12) are perfectly general; other consistent
weights and relatives may be substituted.

7 For a discussion of production indexes of the Paasche and Laspeyres types with
unit-labor-requirement weights (if the form is aggregative) and correlative indexes of average unit labor
requirements and productivity, see H. Magdoff, I. H. Siegel, and M. B. Davis, Production, Employment, and
Productivity in 59 Manufacturing Industries: 1919-1936, (W.P.A. Nat'l. Res. Proj., 1939), Part One,
Ch. 1.
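
The identities (9)-(12) may be confirmed numerically. The sketch below is an illustration on invented data, with `wmean` and `wcov` as hypothetical helper names; it is assumed here that $b_{m:Y\cdot X}$ denotes the weighted regression coefficient of $Y$ on $X$ (weighted covariance divided by the weighted variance of $X$), and correspondingly for $b_{m:X\cdot Y}$.

```python
import math


def wmean(w, a):
    return sum(wi * ai for wi, ai in zip(w, a)) / sum(w)


def wcov(w, a, b):
    ma, mb = wmean(w, a), wmean(w, b)
    return sum(wi * (ai - ma) * (bi - mb) for wi, ai, bi in zip(w, a, b)) / sum(w)


# Illustrative weights and relatives (not from the paper).
m = [10.0, 10.0, 8.0]
X = [1.5, 1.0, 0.75]
Y = [0.8, 1.2, 2.0]

var_X, var_Y, cov_XY = wcov(m, X, X), wcov(m, Y, Y), wcov(m, X, Y)
sd_X, sd_Y = math.sqrt(var_X), math.sqrt(var_Y)
r = cov_XY / (sd_X * sd_Y)                       # definition (5)

# (9): variances of X, Y and of the differences X - Y.
diff = [x - y for x, y in zip(X, Y)]
r9 = (var_X + var_Y - wcov(m, diff, diff)) / (2 * sd_X * sd_Y)

# (10): variance of the difference between standardized deviations.
xs = [(x - wmean(m, X)) / sd_X for x in X]
ys = [(y - wmean(m, Y)) / sd_Y for y in Y]
dstd = [a - b for a, b in zip(xs, ys)]
r10 = 1 - wcov(m, dstd, dstd) / 2

# (11), (12): weighted regression coefficients (assumed convention: Y on X, X on Y).
b_YX = cov_XY / var_X
b_XY = cov_XY / var_Y
r11 = b_YX * sd_X / sd_Y
r12 = b_XY * sd_Y / sd_X

print(r, r9, r10, r11, r12)
```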


EXPRESSIONS INVOLVING SIMPLE CORRELATION 


The difference between the Paasche and Laspeyres indexes may be
reduced to various expressions involving simple coefficients of
correlation between the relatives being averaged and the weights
associated with them.⁸ In certain special cases, of course, these
expressions may explicitly involve simple coefficients of correlation between
the two sets of relatives.

First, we express the difference in the form:

$$\Delta_G = \frac{\sum mXY}{\sum mY} - \frac{\sum mX}{\sum m}
           = \sum (w_P - w_L)X = \sum dX, \tag{13}$$

where $d_i = w_{Pi} - w_{Li} = m_iY_i/\sum mY - m_i/\sum m$, the difference between
relative weights; $\sum w_P = \sum w_L = 1$; and $\sum d = 0$. The sign and magnitude
of the difference thus depend on the sum of the $d_iX_i$, which cannot all
be positive since some $d_i$ must be negative (unless, of course, all are
zero). This is equivalent to saying that:

$$\Delta_G = \sum dX = n\,r_{d\cdot X}\,\sigma_d\,\sigma_X. \tag{14}$$

On the sign of $r_{d\cdot X}$, the coefficient of correlation between the $X_i$ and
the differences in relative weights, depends the sign of the difference
between the indexes.

We may expand (14) into an expression involving the coefficients of
correlation between the $X_i$ and the two series of relative weights:

$$\Delta_G = n\,\sigma_X\left(r_{w_P\cdot X}\,\sigma_{w_P} - r_{w_L\cdot X}\,\sigma_{w_L}\right). \tag{15}$$

Again, we may simplify (15) into the following, which involves the
coefficients of variation of the weights $m_iY_i$ and $m_i$ and the coefficients
of correlation between the $X_i$ and these two sets of weights:

$$\Delta_G = \sigma_X\left[r_{mY\cdot X}\,\frac{\sigma_{mY}}{\overline{mY}}
           - r_{m\cdot X}\,\frac{\sigma_m}{\bar{m}}\right], \tag{16}$$

where $\overline{mY} = \sum mY/n$ and $\bar{m} = \sum m/n$. The introduction of regression
coefficients into this expression reveals that the sign of the difference


8 The notion of correlation between the relatives being averaged and relative (i.e., proportion of
total) weights associated with them receives emphasis in W. M. Persons, Construction of Index
Numbers, 1928, pp. 10 ff., 33 ff.
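between the indexes is the sign of the difference between $b_{mY\cdot X}/b_{m\cdot X}$
and $L_H$:

$$\Delta_G = \frac{\sigma_X^2}{\overline{mY}}\left(b_{mY\cdot X} - b_{m\cdot X}\,L_H\right). \tag{17}$$

From (16), the following expanded form may also be obtained:

$$\Delta_G = \frac{\sigma_X}{\overline{mY}}\left(r_{mY\cdot X}\,\sigma_{mY} - r_{m\cdot X}\,\sigma_m\,\bar{Y}
           - r_{m\cdot X}\,r_{m\cdot Y}\,\sigma_Y\,\frac{\sigma_m^2}{\bar{m}}\right), \tag{18}$$

where $\bar{Y} = \sum Y/n$.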

Finally, we may derive a number of variants of (14)—(18) by making 
use of the well-known identities satisfied by the simple correlation co- 
efficient. 
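
The expressions of this section may be checked with simple (unweighted) moments. The sketch below uses invented data and hypothetical helpers `mean`, `cov`, `sd` and `corr`, all with divisor $n$; it assumes that the regression coefficients in (17) are those of the weights on the $X_i$ (covariance divided by the variance of $X$), and evaluates (13), (14), (16) and (17) against the direct difference.

```python
import math


def mean(a):
    return sum(a) / len(a)


def cov(a, b):
    """Simple covariance with divisor n."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def sd(a):
    return math.sqrt(cov(a, a))


def corr(a, b):
    return cov(a, b) / (sd(a) * sd(b))


# Illustrative data only.
m = [10.0, 10.0, 8.0]
X = [1.5, 1.0, 0.75]
Y = [0.8, 1.2, 2.0]
n = len(m)
mY = [mi * yi for mi, yi in zip(m, Y)]

L_G = sum(mi * xi for mi, xi in zip(m, X)) / sum(m)
P_G = sum(wi * xi for wi, xi in zip(mY, X)) / sum(mY)
L_H = sum(mY) / sum(m)
delta_direct = P_G - L_G

# (13): differences between Paasche and Laspeyres relative weights.
d = [wi / sum(mY) - mi / sum(m) for wi, mi in zip(mY, m)]
delta_13 = sum(di * xi for di, xi in zip(d, X))

# (14): simple correlation between the d_i and the X_i.
delta_14 = n * corr(d, X) * sd(d) * sd(X)

# (16): coefficients of variation of the two sets of weights.
delta_16 = sd(X) * (corr(mY, X) * sd(mY) / mean(mY) - corr(m, X) * sd(m) / mean(m))

# (17): regression coefficients of the weights on X (assumed convention).
b_mY_X = cov(mY, X) / cov(X, X)
b_m_X = cov(m, X) / cov(X, X)
delta_17 = cov(X, X) / mean(mY) * (b_mY_X - b_m_X * L_H)

print(delta_direct, delta_13, delta_14, delta_16, delta_17)
```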

OTHER EXPRESSIONS 


There are many useful expressions for the difference which do not 
involve weighted or unweighted correlation or regression coefficients. 
At least two, (4) and (13), have already been presented. A. L. Bowley 
apparently prefers (4) to forms explicitly involving the weighted
correlation coefficient.⁹ Indeed, (4) and (13) are algebraically simpler than
the expressions thus far derived from them; but correlation and regres- 
sion coefficients are conceptually simple, have familiar interpretations, 
and, in any case, are commonly used. 

If we select a particular pair of relatives, say $X_1$ and $Y_1$, as arbitrary
means, we may readily deduce the following from (4):

$$\Delta_G = \frac{\sum\left[m(X - X_1)(Y - Y_1)\right]}{\sum mY}
           - \frac{(L_G - X_1)(L_H - Y_1)}{L_H}. \tag{19}$$

But neither the true nor arbitrary means need explicitly enter into the
expressions for the difference. The following form, for example, involves
interrelations among the elements of the basic series:

$$\Delta_G = \frac{1}{\sum m \sum mY}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}
           (g_i'g_j - g_ig_j')(h_i'h_j - h_ih_j'), \tag{20}$$

where the double sum includes $n(n-1)/2$ terms. W. V. Lovitt has
obtained a similar result.¹⁰ Division and multiplication of each term of the
sum in (20) by $m_im_j = g_ig_jh_ih_j$ yields a more significant form; the sum
now includes the weighted products of differences between relatives:
9 See H. Staehle, op. cit., p. 16 (footnote), and A. L. Bowley, Wages and Income in the United
Kingdom since 1860, 1937, pp. 125-26. Incidentally, the direction of the inequality on p. 126 (top) should be
reversed for the case of positive correlation between deviations from the means. For another analysis
of the difference by Bowley, see Wages and Income, pp. 107-08.

10 W. V. Lovitt, “Index Number Bias,” this JOURNAL, 1928, p. 11.
$$\Delta_G = \frac{1}{\sum m \sum mY}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}
           \left[m_im_j(X_i - X_j)(Y_i - Y_j)\right]. \tag{21}$$

Again, noting that $Y_i - Y_j = m_iY_i/m_i - m_jY_j/m_j$ (i.e., the difference
between ratios of $P_G$ to $L_G$ weights), we may consider (21) a special
case of D. C. Jones’ general expression for the difference between two
weighted arithmetic means.¹¹

The forms (20) and (21) are useful in determining the contributions 
of the various elements to the sign of the difference. From either, it is 
obvious that the Paasche index is necessarily greater than the Laspeyres 
index if the rank correlation between the two series of relatives is 
perfect; for, in this case, all of the summed products are positive. The 
Paasche index is necessarily smaller if the ordinal sequence of one series 
of relatives is the exact inverse of the sequence of the other; for, in this 
case, all of the summed products are negative. 
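
The double sums (20) and (21), and the remark on ordinal sequence, may be illustrated as follows. The data are invented; in them the price relatives fall from item to item while the quantity relatives rise, so that the one sequence is the exact inverse of the other and every product in the sum should be negative.

```python
from itertools import combinations

# Illustrative data only: X falls (1.5, 1.0, 0.75) while Y rises (0.8, 1.2, 2.0).
g0, h0 = [1.0, 2.0, 4.0], [10.0, 5.0, 2.0]
g1, h1 = [1.5, 2.0, 3.0], [8.0, 6.0, 4.0]
n = len(g0)

m = [g0[i] * h0[i] for i in range(n)]
X = [g1[i] / g0[i] for i in range(n)]
Y = [h1[i] / h0[i] for i in range(n)]
sum_m = sum(m)
sum_mY = sum(g0[i] * h1[i] for i in range(n))     # sum of m_i Y_i = g_i h_i'

P_G = sum(g1[i] * h1[i] for i in range(n)) / sum_mY
L_G = sum(g1[i] * h0[i] for i in range(n)) / sum_m
delta_direct = P_G - L_G

# (20): products of cross-differences of the basic series, over all pairs i < j.
terms_20 = [
    (g1[i] * g0[j] - g0[i] * g1[j]) * (h1[i] * h0[j] - h0[i] * h1[j])
    for i, j in combinations(range(n), 2)
]
delta_20 = sum(terms_20) / (sum_m * sum_mY)

# (21): the same terms written as weighted products of differences of relatives.
terms_21 = [
    m[i] * m[j] * (X[i] - X[j]) * (Y[i] - Y[j])
    for i, j in combinations(range(n), 2)
]
delta_21 = sum(terms_21) / (sum_m * sum_mY)

print(terms_20, delta_20, delta_21, delta_direct)
```

With three items the sum has three terms, all negative in this example, and the Paasche index is below the Laspeyres index.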

The expression (13) may also be expanded into a double sum
involving $_nC_2$ terms:

$$\Delta_G = \sum dX = \frac{1}{n}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(d_i - d_j)(X_i - X_j). \tag{22}$$

From this form, it is obvious that the Paasche index is necessarily
greater than the Laspeyres index if the coefficient of rank correlation
between the $d_i$ and $X_i$ equals +1; and necessarily smaller if the
coefficient equals −1. Since the $d_i$ represent differences between relative
weights, (22) may be expanded into the difference between two
double sums.

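It is convenient, and for some purposes it may be useful, to express
the double sums of (20)-(22) in the form of determinants. Thus, the
sum in (20) may be regarded as comprising the $_nC_2$ determinant
products of the form

$$\begin{vmatrix} g_i' & g_j' \\ g_i & g_j \end{vmatrix} \cdot
  \begin{vmatrix} h_i' & h_j' \\ h_i & h_j \end{vmatrix}$$

obtainable from the two-rowed matrices

$$\begin{Vmatrix} g_1' & g_2' & \cdots & g_n' \\ g_1 & g_2 & \cdots & g_n \end{Vmatrix}
  \quad\text{and}\quad
  \begin{Vmatrix} h_1' & h_2' & \cdots & h_n' \\ h_1 & h_2 & \cdots & h_n \end{Vmatrix}.$$

Similarly, the sum in (21) may be regarded as comprising all the
weighted¹² determinant products of the form

$$\begin{vmatrix} X_i & X_j \\ 1 & 1 \end{vmatrix} \cdot
  \begin{vmatrix} Y_i & Y_j \\ 1 & 1 \end{vmatrix},$$

where the determinants are obtainable from the matrices

$$\begin{Vmatrix} X_1 & X_2 & \cdots & X_n \\ 1 & 1 & \cdots & 1 \end{Vmatrix}
  \quad\text{and}\quad
  \begin{Vmatrix} Y_1 & Y_2 & \cdots & Y_n \\ 1 & 1 & \cdots & 1 \end{Vmatrix}.$$

Finally, the sum in (22) may be regarded as composed of $_nC_2$ products
of the form

$$\begin{vmatrix} d_i & d_j \\ 1 & 1 \end{vmatrix} \cdot
  \begin{vmatrix} X_i & X_j \\ 1 & 1 \end{vmatrix}$$

obtainable from the matrices

$$\begin{Vmatrix} d_1 & d_2 & \cdots & d_n \\ 1 & 1 & \cdots & 1 \end{Vmatrix}
  \quad\text{and}\quad
  \begin{Vmatrix} X_1 & X_2 & \cdots & X_n \\ 1 & 1 & \cdots & 1 \end{Vmatrix}.$$

11 D. C. Jones, First Course in Statistics, 1921, pp. 263-64.

12 The weights $m_im_j$ may also be considered as the values of second-order determinants drawn from
a diagonal $m$-matrix of the $n$th order. Such determinants would include $m_{ii}$ ($= m_i$) and $m_{jj}$ ($= m_j$) in the
principal diagonal and zeros in secondary diagonal.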
We conclude with an expression developed by A. L. Bowley which
is applicable to income and cost-of-living analyses. If $P_G$ and $L_G$
represent the price indexes for the $t_1$ and $t_0$ budgets, respectively, and
$L_H$ is the quantity index with $t_0$ weights, we have:¹³

$$\Delta_G = \frac{(1 + p)\sum f\eta s^2 - pv}{L_H}, \tag{23}$$

where $p = L_G - 1$, the relative change in the price level, according to
the Laspeyres index, between $t_0$ and $t_1$; $v = M - L_G$, the difference
between the expenditure (or money-income) and Laspeyres price indexes;
$s_i = X_i - 1$, the relative change in the price of the $i$th good; $f_i$ is the
proportion of the total expenditure allotted to the $i$th good at $t_0$
($\sum f = 1$); and $\eta_i = g_i(h_i' - h_i)/h_i(g_i' - g_i)$, the (unsigned) price
elasticity of demand for the $i$th good when the differences are small. If
all the $\eta_i < 0$, it may be seen from (23) that: $P_G < L_G$ when $pv > 0$,
which is the case when total expenditure (or money income) increases or
decreases more rapidly than prices in general; $P_G < L_G$ when $p = 0$ or
$v = 0$, but some $s_i \neq 0$; and $P_G > L_G$ when $p$ and $v$ are opposite in
sign, provided the $\eta_i$ are not very large.
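
The relation among $p$, $v$, $s_i$, $f_i$ and $\eta_i$ may be examined numerically. The sketch below takes the definitions given in the text and evaluates the identity $\Delta_G = [(1+p)\sum f\eta s^2 - pv]/L_H$, which follows algebraically from those definitions; whether this arrangement agrees term for term with the printed form of (23) is an assumption. The data are invented, and every price is made to change so that each $\eta_i$ is defined.

```python
# Illustrative data only; every price changes so that no s_i is zero.
g0, h0 = [1.0, 2.0, 4.0], [10.0, 5.0, 2.0]
g1, h1 = [1.5, 2.2, 3.0], [8.0, 6.0, 4.0]

m = [p0 * q0 for p0, q0 in zip(g0, h0)]
f = [mi / sum(m) for mi in m]                 # base-period expenditure shares
X = [a / b for a, b in zip(g1, g0)]           # price relatives
Y = [a / b for a, b in zip(h1, h0)]           # quantity relatives

s = [x - 1.0 for x in X]                      # relative price changes
eta = [(y - 1.0) / si for y, si in zip(Y, s)]  # g(h'-h) / h(g'-g)

L_G = sum(fi * xi for fi, xi in zip(f, X))    # Laspeyres price index
L_H = sum(fi * yi for fi, yi in zip(f, Y))    # Laspeyres quantity index
M = sum(fi * xi * yi for fi, xi, yi in zip(f, X, Y))   # expenditure index
P_G = M / L_H                                 # Paasche price index

p = L_G - 1.0
v = M - L_G
delta_direct = P_G - L_G
delta_23 = (
    (1.0 + p) * sum(fi * ei * si ** 2 for fi, ei, si in zip(f, eta, s)) - p * v
) / L_H

print(delta_direct, delta_23)
```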





13 The expression (23) is a modification of Bowley's formula in Wages and Income, p. 126; it is
incorrectly given in G. Lutfalla, “Compte Rendu de la Réunion d’Annecy, 12-15 Septembre 1937,”
Econometrica, 1939, p. 86.




















ON SAMPLE INSPECTION IN THE PROCESSING OF 
CENSUS RETURNS 


By W. EDWARDS DEMING AND LEON GEOFFREY
Bureau of the Census 


CERTAIN COUNTS of the census are required to be in exact conformity
with the enumeration, and the processing of any such data must
be completely checked. In the preparation of certain other data of the 
census, however, it is possible to introduce sample inspection in those 
stages of the processing that are carried out with such accuracy and 
uniformity that 100 per cent inspection would be an unnecessary and 
wasteful refinement of the information supplied by the enumerators.

This article describes an application of a system of sample inspection 
whereby suitable action criteria control the processing of the returns of 
the 1940 population and housing census, in those stages wherein exact 
conformity with the enumeration is not required. The object of the 
sample inspection is to satisfy consumer tolerances with a minimum 
cost of inspection. The results show that such procedures, ordinarily 
associated chiefly with the product of machines, are also applicable to 
the product of direct human effort. The approach remains the same; 
namely, the examination of small portions of product in rational sub- 
groups, taken in order of time, with the object of attaining and main- 
taining control of the quality within desired specifications. 


VERIFICATION IN THE PROCESSING OF CENSUS DATA 


Statistical tables are the result of processing the returns. The Bureau of 
the Census is a statistical factory. The main product is statistical tables 
for the use of other government agencies, social and economic research 
organizations, distributors of goods, and for analysis by the census staff, 
etc. The schedules that were turned in by the enumerators who did the 
field work must be processed, in order that the data collected may be 
made into the finished product. In many ways this processing of the 
census schedules resembles a belt-line system in a factory, where dif- 
ferent operations are performed, and each operation is completed before 
the unfinished product is moved on to the next one. 

A comparison of some of the census terms with the corresponding 
industrial terms may be useful at this point. 
Industrial term                        Census term

Process                                Operation
Inspection                             Verification
Defect                                 Error
Fraction defective                     Error rate (e.g., the number of
                                         wrong cards per 100 cards
                                         punched)
All defective parts discovered are     All errors found are corrected
  adjusted or replaced with good
  ones

The necessity for verification. In centralized inspection, the work is 
sent to a different part of the factory for inspection. In decentralized 
inspection the work is inspected by people working right along with 
the operators, and under the same supervision. The census uses cen- 
tralized inspection in the verification of coding, and decentralized 
inspection in the verification of card punching. Coders and punchers 
make errors, and it has therefore long been the custom of the census 
to verify their work at each operation, in order that the published 
statistical tables may have the desired accuracy. To correct an error 
at the least expense it must be caught immediately after the operation 
in which it has been made. 

Verification need not always be 100 per cent. A verification process has 
three main functions: 

i. To find errors and correct them.
ii. To influence individuals not to make errors
    a. by bringing to their individual attention both the nature
       and the quantity of the errors that they are making.
    b. by applying penalties to their efficiency ratings for these
       errors.
iii. To find out how many errors are being made by each worker.


It may happen in some operations that the second function is carried 
out so effectively that in view of the nature of the data being processed, 
the first function is unnecessary. Under such circumstances it is suffi- 
cient to use a verification system that accomplishes only the second 
function, i.e., maintains control of the quality. Verification on a sample 
basis is satisfactory for this purpose in many operations, and at sub- 
stantial savings. This paper describes a system of verification on the 
basis of samples randomly chosen from work that is already known to 
be uniformly accurate to a certain degree (cf. the section “Criteria for 
selecting work for sample verification”). 
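
The general idea of verification on a sample basis may be sketched as follows. This is an illustration only and not the Bureau's procedure: the function name `sample_verify`, the sample size, the tolerance of 2 wrong cards per 100, and the rule for marking a card wrong are all hypothetical. A random sample is drawn from one worker's cards, the sampled cards are checked, and the observed error rate per 100 cards is compared with the tolerance.

```python
import random


def sample_verify(cards, is_wrong, sample_size, tolerance_per_100, seed=0):
    """Verify a random sample of cards and report the observed error rate.

    cards             -- the worker's output for the period (any sequence)
    is_wrong          -- function returning True when a card is in error
    sample_size       -- number of cards to draw without replacement
    tolerance_per_100 -- hypothetical limit on wrong cards per 100 sampled
    """
    rng = random.Random(seed)                 # fixed seed for a repeatable draw
    sample = rng.sample(list(cards), sample_size)
    errors = sum(1 for card in sample if is_wrong(card))
    rate = 100.0 * errors / sample_size       # wrong cards per 100 cards
    return {
        "sample_size": sample_size,
        "errors": errors,
        "rate_per_100": rate,
        "within_tolerance": rate <= tolerance_per_100,
    }


# Hypothetical example: 1,000 cards, of which every 50th is wrong (2 per 100).
cards = list(range(1000))
result = sample_verify(cards, lambda c: c % 50 == 0, sample_size=100,
                       tolerance_per_100=2.0)
print(result)

# Limiting cases used as a check on the arithmetic.
all_good = sample_verify(cards, lambda c: False, 50, 2.0)
all_bad = sample_verify(cards, lambda c: True, 50, 2.0)
```

A single sample of this size gives only a rough estimate of a worker's error rate; the value of the procedure lies in small samples taken frequently from work already known to be uniformly accurate.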
Sample verification is a control process. Sample verification provides
data for control of the error rate of the individual worker. It does not 
eliminate all errors, and therefore can not be used in any part of the 
work that is required to be in exact conformity with the enumeration, 
as for example, the counts of the population according to geographic 
areas, which by definition serve as the basis for Congressional re- 
apportionment; for such critical items of information, sample verifica- 
tion will not suffice; the processing of such data must be verified 
completely at every stage. 


MORE FINISHED PRODUCT MADE POSSIBLE BY SAVINGS 

The following estimates of the savings effected by sample verification 
of census processes refer to the savings in direct labor cost. There are 
additional savings, not estimated here, that result from reduction in 
the cost of supervision, use of machinery, amount of floor space occu- 
pied, and overhead expenses. These additional savings more than 
counterbalance the cost of the planning and administration of the 
sample verification procedure. 


Census process to which sample verification         Savings that were effected
was applied                                         by sample verification

Preliminary employment transcription                        $  3,000
General coding of the population schedules                    82,000
Occupation coding of the population schedules                 68,000
General coding of the housing schedules                       15,000
Punching of the individual population cards                   73,000
Punching of the housing dwelling cards                        22,000
Total savings                                               $263,000


These savings will be further augmented by the application of sample 
verification in other operations that are yet to be carried out (family 
transcription, punching of the family card, and other work). Any savings 
effected in the processing of the returns enable the Bureau of the Census 
to carry out more tabulations and studies than would otherwise be pos- 
sible. 

It should be remarked that the savings effected through sample veri- 
fication are accomplished by increasing the production of a given force 
of workers, not through pressure to increase their productive effort, but 
through more efficient use of their abilities. 

Sample verification not only conserves funds but advances the date 
of completion of any operation in which it is used. 

SOME REQUIREMENTS OF SAMPLE VERIFICATION


Reliable performance essential for sample verification. It is considered 
as basic that any independent producing unit (coder, puncher, machine)
must have a history of consistently accurate production, determined 
from an objective record, before sample verification can take the place 
of 100 per cent verification. If previous performance has shown the 
work of a certain producing unit to be accurate within either the control 
limits or the administrative tolerance limits (cf. the section “Criteria 
for selecting work for sample verification”), small but frequent samples 
of the product will suffice for maintaining control of the production 
process. This is so in spite of the fact that the same size of sample from 
a single day’s production may not be accurate enough to determine the 
error rate for that one day within the accuracy that may be required, 
even when the process is controlled. A producing unit that is known to 
be uniformly accurate enough for sample verification is of enhanced 
value in an organization because the output of such a unit can be in- 
spected at reduced expense. For this reason, workers whose output 
can be subjected to sample verification may be given preference in 
promotions and retention in the service. 

Records and supervision. The fundamental requirement for the proper 
application of sample verification in the processing of census returns is 
that the work be of uniformly good quality. To be sure of this, it is 
necessary to keep a running record of the production and errors of each 
individual, and to watch each record carefully day by day, or week by 
week. It was found essential also that sample verification be conducted 
under special supervision; care was taken, however, to have the 
specialized supervision carried out in cooperation with the regular 
supervisors and section chiefs, and with their direct assistance whenever 
possible. 

There are many ways in which sample verification may become a 
liability if it is not carefully supervised. The net savings that are to be 
effected by sample verification can be obtained with safety only if an 
ample amount is deducted from the gross savings for adequate records 
and supervision. It is a great temptation for an administrator to press 
continually for fewer records and less supervision. 


SAMPLE VERIFICATION OF PUNCHING 


Organization of the work. Description of the actual working of sample 
verification will be confined to the procedure for card punching, since 
the same principles apply to the sample verification of the coding operations
as well.

The card punching is done by operators working in sections. A section
consists of 20 operators, a section chief, and an assistant section
chief. The section chief is responsible for the punching of the cards 
from the schedules assigned to him, and also for the verification of these 
cards (decentralized inspection). Some of the operators in his section are 
therefore designated as verifiers and the remaining ones as punchers. 

The punch cards. In the operation of punching, the information re- 
ported by the enumerator is transferred to punch cards. Before the 
schedules are sent to the punchers, all but the simplest items of in- 
formation have been translated into codes in the coding operations. 
The codes and editorial changes are written directly on the schedule 
with colored pencil. Both the housing and population punch cards are 
45-column cards, 12 positions to a column. 

After the cards have been punched they are sent along with the 
schedules to a verifier for inspection. The punching machine has cut 
the holes in the cards, and the verifying machine is used to test whether 
these holes are in the proper positions. The motions required to verify
a card are almost the same as those for punching that card. 

Selection of the sample. One verifier in each section is able to verify on 
the sample basis the work of all the “qualified” punchers in the section. 
A qualified puncher is one whose work is good enough to be eligible for 
sample verification, as is elaborated later. The verifier examines one-
twentieth of the cards for each unit¹ of a qualified puncher’s work,
taking, on the average, one card out of every eight minutes’ work of a 
puncher. In the sections working with the housing schedules, every 
twentieth card is verified, the starting point being randomized and 
being designated to each verifier daily so that neither the verifier nor 
the puncher knows in advance which cards are to be examined. Since 
cards are numbered on the reverse side, the verifier can be held responsi- 
ble for accurate verification of particular cards. A similar system, neces- 
sarily a little more involved, is used to select the population cards. 
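
The rule described above for the housing cards (every twentieth card, from a starting point chosen at random each day) is a systematic sample with a random start. It may be sketched in Python as follows. The function name `systematic_sample`, the 0-based card positions, and the seeded random generator are assumptions made for illustration only; they are not details of the Bureau's procedure.

```python
import random


def systematic_sample(n_cards, interval=20, rng=None):
    """Return the 0-based positions of the cards to verify: every
    `interval`-th card, starting at a random point in the first interval."""
    rng = rng or random.Random()
    start = rng.randrange(interval)  # not known in advance to puncher or verifier
    return list(range(start, n_cards, interval))


# Illustrative folio of 900 cards (about the average population folio).
picked = systematic_sample(900, interval=20, rng=random.Random(1940))
print(len(picked))  # 45 cards, i.e. one-twentieth of the folio
```

Because the start is drawn afresh each time, every card has the same one-in-twenty chance of being examined, while the verifier handles only a fixed and predictable number of cards per folio.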

Work that is subject to sample verification flows through the verifica- 
tion process more than six times as fast as work that is not. The factor 
is six, not twenty, because of the time required in picking out the cards 
for sample verification, and because there are certain overhead motions 
that must be performed for every folio regardless of whether it is sub- 
ject to sample verification (opening it and setting it in the holder, 
making entries on the record forms, moving the boxes of cards, etc.). In
addition, a small proportion of the work that is subject to sample
verification is verified 100 per cent (see i. in the section entitled
“Corrective effects,” infra) and the time charged to sample verification.

¹ The unit of work is a folio. One population folio contains the schedules for the territory assigned
to one enumerator. From 1 to 3,000 or more persons may be enumerated on the schedules contained in
one folio; the average number is about 900, a little less than a day's punching for one puncher. One
housing folio contains the housing schedules for the territories assigned to four enumerators.

Punchers differ with regard to accuracy. Punchers might be classified
into four groups with respect to sample verification: i. learners; ii. those
who have had the opportunity to learn, but who have not demonstrated
ability to punch with a consistently low rate of error; iii. punchers who
are accurate but who are likely to stress speed at the expense of
accuracy unless they are penalized for errors; iv. those who are accurate
and would remain so regardless of whether a record of errors were kept.
Chart I shows the error rates of three punchers; the top one is
consistently high, and the middle one is erratic. These fit description ii. The
error rate of the bottom one is consistently low and fits description iii.
or perhaps iv. Sample verification is restricted to punchers like this one,
to keep the error rate low and under control.


CHART I

THREE ERROR RECORDS, ILLUSTRATING THREE DIFFERENT LEVELS OF
QUALITY AND VARIABILITY

Top: never likely to become qualified for sample verification.
Middle: too much variability to become qualified.
Bottom: eligible for sample verification.

Criteria for selecting work for sample verification. Three objective 
factors can be used to select punchers whose work is of sufficiently high 
quality for sample verification, viz.— 


i. Length of experience. 
ii. The average error rate (number of wrong cards² per 100 cards punched)
for each week during this experience. 
iii. Fluctuations in the error rate from day to day and from week to week. 


The criteria developed by experience for the minimum requirements 
for qualifying punchers are shown below, along with the requirement 
for disqualifying. 

i. To qualify: “At least two of the last four weeks must show an average
error rate of not more than 1 wrong card per 100 cards punched, and no 
week of the last four shall show an average of more than 2 wrong cards 
per 100 cards punched. (Weeks during which fewer than 2,000 cards 
were punched will not be counted.) In addition to the above, only one of 
the last four weeks may include a folio for which there were more than 
3 wrong cards per 100 cards punched. (Folios of fewer than 300 cards 
will not be counted.)” 

ii. To disqualify: “A puncher will be dropped from sample verification if
the average error rate for any week, determined from samples of her 
work, exceeds 3 wrong cards per 100 cards punched, or if it exceeds 2 
wrong cards per 100 cards punched for each of two weeks out of the last 
four.” 
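
The two rules quoted above can be written out as a short Python sketch. The class and function names are hypothetical. Two readings of the rules are assumed here and are not stated in the text: that "the last four weeks" means the last four weeks in which at least 2,000 cards were punched, and that the 3-per-100 test for disqualification is applied to the most recent week as each week's record comes in.

```python
from dataclasses import dataclass


@dataclass
class Folio:
    cards: int   # cards punched in the folio
    wrong: int   # wrong cards found on verification


@dataclass
class Week:
    folios: list  # Folio records for one week's work

    @property
    def cards(self):
        return sum(f.cards for f in self.folios)

    @property
    def rate(self):
        """Wrong cards per 100 cards punched during the week."""
        return 100.0 * sum(f.wrong for f in self.folios) / self.cards

    def has_bad_folio(self):
        """True if any folio of 300 or more cards shows more than 3 wrong per 100."""
        return any(f.cards >= 300 and 100.0 * f.wrong / f.cards > 3
                   for f in self.folios)


def qualifies(weeks):
    """Apply the qualifying rule to weekly records listed oldest first."""
    counted = [w for w in weeks if w.cards >= 2000][-4:]  # assumed reading
    if len(counted) < 4:
        return False
    rates = [w.rate for w in counted]
    return (sum(r <= 1 for r in rates) >= 2
            and all(r <= 2 for r in rates)
            and sum(w.has_bad_folio() for w in counted) <= 1)


def disqualified(sample_rates):
    """sample_rates: weekly rates (wrong cards per 100) from samples, oldest first."""
    if not sample_rates:
        return False
    return sample_rates[-1] > 3 or sum(r > 2 for r in sample_rates[-4:]) >= 2


good = Week([Folio(900, 6), Folio(900, 8), Folio(700, 5)])  # 0.76 per 100
fair = Week([Folio(1000, 15), Folio(1200, 18)])             # 1.5 per 100
print(qualifies([good, fair, good, fair]))   # True
print(disqualified([0.8, 2.4, 1.1, 2.6]))    # True: two weeks above 2 per 100
```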


The numerical values of the limits involved in these criteria were de- 
termined by consideration of two factors: 

i. The level of error that could be allowed without more than negligible 
impairment of the finished product (level of error demanded by the con- 
sumer). 

ii. The proportion of punchers that would qualify at any given level (level 
of error that can be met by the producer). 


Graphic records. Error rates were plotted on a weekly basis (Charts I 
and II). The work of each puncher was verified 100 per cent until the 
graph showed that the degree of accuracy required for sample verifica- 
tion had been achieved (supra). From this point on, her work would be 
verified on the sample basis, and the error rate plotted; the graph served 
to show whether the previously satisfactory performance was continuing. 
For example, the puncher whose record is shown in Chart II became 
eligible for sample verification as soon as her performance met the re- 
quirements for the four-week period. 

Amount of punching subjected to sample verification. Supervisional
staff. Sample verification of the punching was in operation for a period
of seven months, during which 51,000,000 population and housing punch
cards were subject to sample verification out of a total of 175,600,000
cards punched (29 per cent). At the peak of activity, records were being
kept for 1,265 punchers and 498 verifiers. The maximum number of
punchers that were qualified for sample verification at any one time
was 473, or 39 per cent of all punchers at that time. Seven people were
needed to maintain the records, three to train and supervise the verifiers,
two to make studies of the error records and special problems, and
one to supervise the project under the direction of the mathematical
adviser.

² For administrative convenience, error rates are expressed in terms of wrong cards. A wrong card
may have one or more errors (wrong punches), but usually there is only one. The average number of
puncher’s errors on a wrong population card, determined from a study of 25,000 wrong cards, was 1.47.
Thirty-nine columns are to be punched on each card, the remaining six of the 45-column card being cut
mechanically for identification. There are thus 39 chances for error on each card. An error rate of 1
wrong card per 100 cards punched should therefore be interpreted as 1.47 errors per 3,900 chances of
making an error, which is far within the limits of error in the original data.

CHART II 


WEEKLY ERROR RECORD OF A CARD PUNCHER 
The error rate was high over the learning period, and erratic on occasions shortly thereafter, as 
during the fourth and sixth weeks where new situations were met. As sources of error are eliminated the 
curve becomes steady and exhibits the property of stability, as shown by the fact that the points stay 
within the control limits. 

Of all the punchers that qualified, only 13 were later disqualified
in accordance with the rule for disqualifying (supra). The disqualified 
punchers were given the opportunity to qualify again, but as a matter 
of experience, only 4 of the 13 were ever able to do so. 


CONTROL OF THE ERROR RATE 


Corrective effects. The error rate of work that is sample verified is not 
only controlled at a low level, but the work itself is subject to certain 
corrective effects. In the first place, any errors that are found by the 
verifier in the (5 per cent) sample are corrected. In addition, certain 
other corrective effects arise, chiefly from i. reverification; ii. visual 
verification; iii. machine rejection; and iv. compensatory errors. 

i. The sample verifier is instructed to verify all the cards for a
folio whenever the error rate of the sample is in excess of 3 
wrong cards per 100 cards. In practice this results in the re- 
verification of 2 per cent of the work of the qualified punchers. 
This rule serves to give an occasional spot check on the system of 
sample verification. Owing to the method of selecting folios 
for reverification, the correction of all errors that are found in 
this 2 per cent of the work of the qualified punchers removes 
more than 2 per cent of the total number of errors, and the 
average rate of error that is permitted to pass is thereby 
lowered. 

ii. Visual verification of certain columns can be accomplished by 
holding together all of the cards for a schedule and looking 
through the holes for which the punch is supposed to be in the 
same place on each card. This procedure requires practically 
no expenditure of time, and enables the sample verifier to in- 
spect on the 100 per cent basis more than 7 per cent of the 
work of a qualified puncher in addition to the 5 per cent sample 
and the reverification of folios found in sample verification to 
exceed the allowable error limit. 

iii. The sorting and tabulating machinery is set to reject cards 
having certain inconsistent punches, and these cards are cor- 
rected. The number of such rejections and the cost of critical 
analysis were not noticeably increased, however, as a result of 
sample verification. 

iv. The effect of one error upon a tabulated frequency distribution 
may be canceled by another. 
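
The reverification rule in item i. above reduces to a single comparison, sketched here in Python. The function name and the treatment of an empty sample are assumptions for illustration.

```python
def needs_full_verification(sample_size, wrong_in_sample, limit_per_100=3.0):
    """True when the sample error rate exceeds the limit (wrong cards per 100),
    in which case every card in the folio is to be verified."""
    if sample_size == 0:
        return False  # assumed: nothing sampled, nothing to trigger on
    return 100.0 * wrong_in_sample / sample_size > limit_per_100


# A folio of about 900 cards yields a sample of about 45 cards.
print(needs_full_verification(45, 1))  # False: about 2.2 wrong per 100
print(needs_full_verification(45, 2))  # True: about 4.4 wrong per 100
```

On a sample of this size, then, a single wrong card passes and a second one sends the whole folio back for complete verification.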

Statistical control. It is interesting to note that for over 90 per cent 
of the qualified punchers, subsequent to the date of “qualifying,” the 
points plotted to represent the average weekly error rates (Chart II) 
fall entirely within the Shewhart “3 sigma” control limits. The 3 sigma
limits are computed as $\bar{p} \pm 3\sigma$, where for any puncher $\bar{p}$ is the average
number of wrong cards per 100 cards punched during the entire period
covered by the control chart, $\sigma = \sqrt{\bar{p}\bar{q}/\bar{n}}$, $\bar{q} = 1 - \bar{p}$, and $\bar{n}$ is the
average number of cards in the 5 per cent sample of one week’s work.
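
As a worked illustration of the limits just defined, the following Python sketch computes them for one puncher. It assumes that the error rate is expressed as a proportion (0.01 for 1 wrong card per 100) so that the complement is 1 minus that proportion, and that a negative lower limit is replaced by zero; the weekly sample of 250 cards is an invented figure, not one taken from the census records.

```python
import math


def three_sigma_limits(p_bar, n_bar):
    """p_bar: long-run proportion of wrong cards (0.01 means 1 per 100).
    n_bar: average number of cards in one week's 5 per cent sample.
    Returns (lower, upper) as proportions; the lower limit is floored at 0."""
    q_bar = 1.0 - p_bar
    sigma = math.sqrt(p_bar * q_bar / n_bar)
    return max(0.0, p_bar - 3 * sigma), p_bar + 3 * sigma


lower, upper = three_sigma_limits(p_bar=0.01, n_bar=250)
print(f"{100 * lower:.2f} to {100 * upper:.2f} wrong cards per 100")
# prints: 0.00 to 2.89 wrong cards per 100
```

With these assumed figures the upper limit falls close to the administrative limit of 3 wrong cards per 100 used for disqualification, although the two kinds of limit are arrived at differently.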

Administrative tolerance limits (the criteria for qualification and dis- 
qualification; supra) were the same for all punchers and they regulated 
the number of punchers qualified at any one time. Had the punching 
been of longer duration, individual control limits could have served as 
a tool for the discovery of sources of error in the work of those punchers 
that were within the administrative tolerance limits. 

When the circumstances associated with a point that falls above the
upper control limit $\bar{p} + 3\sigma$ are investigated, a plausible explanation is
usually found. Three typical cases are outlined below. 

i. Puncher No. 1417, May 12-17; punched folios that had been 
very poorly filled out by the enumerator—entries difficult to 
read, not in proper position. 

ii. Puncher No. 1483, May 12-17; had just returned from a siege 
of measles. 

iii. Puncher No. 1859, June 2-7; high error rate laid to sickness
in the family, which distracted her interest.


NEW FEATURES OF THE 1940 POPULATION CENSUS*


By Leon E. Truesdell, Chief Statistician for Population
Bureau of the Census 


A BRIEF REVIEW of some of the more important changes embodied in
the Censuses of 1930 and 1920 provides a background that is useful
in considering the new features of the 1940 Census.

The Census of 1920 followed very closely the pattern of the Census 
of 1910, the principal new features being the following: The tabulation 
of the farm population by states only, not by counties; more nearly 
adequate publication of the occupation data, although with a simpler 
classification; and a special inquiry on the amount of mortgage debt on
mortgaged homes. 

The new features in the 1930 Census, as compared with that of 1920, 
included: The unemployment inquiries, which were carried on a sepa- 
rate schedule; the question on value or rent of home, which was perhaps 
more widely appreciated than any other one of the new features; ex- 
tensive tabulations of family data, which were made through the use of 
an additional punch card; the separation of all figures for rural popula- 
tion into rural-farm and rural-nonfarm portions; the publication of a 
brief classification of the population of townships and small incorpo- 
rated places; data for gainful workers by counties; a fairly detailed age 
classification for the population of counties and small cities; a tabula- 
tion of occupations by industry (similar to a tabulation made in 1910 
but not made in 1920); a brief classification of the population of metro- 
politan districts; and additional tabulations for the foreign stock, in- 
cluding classification by age, literacy, ability to speak English, and 
marital status, all by country of origin. 

Some of these new features represent additions to the schedule or 
entirely new tabulations. Others represent simply extensions of the 
detail, or of the geographic areas represented, in tabulations which had 
been made in some form for many decades. Judged by the comment 
from those who have used the data and by the demand for additional 
expansion, the latter are hardly less important than the former. 

Among the changes incorporated in the 1940 Census there are per- 
haps six which deserve more extensive mention than the others, namely, 
the housing inquiries, the sampling procedure, the incorporation of the 
questions on employment and unemployment into the main population 
schedule, the section on migration, the publication of reports for census
tracts in 61 cities, and greatly expanded reports for metropolitan districts.

* A paper presented at the 102nd Annual Meeting of the American Statistical Association, Chicago,
December 28, 1940, brought up-to-date and coordinated with the other papers of this session.


HOUSING INQUIRIES

The housing inquiries were set up nominally as a separate census, 
partly because they were added to the program by legislation enacted 
very late (the appropriation, in fact, not being finally approved until 
April 6, 1940) and partly because of the essentially different nature of 
the questions. The information on the two schedules, Population and 
Housing, was obtained, however, by the same enumerators from the 
same source at the same time, and a number of the questions were of 
common interest to the two inquiries. One important classification of 
housing units, for example, is that based on the number of persons in 
the household occupying the unit, and conversely, many of the items 
descriptive of the housing accommodations occupied are significant for 
any analysis of the population data which requires measurement of the 
economic bases of family living. 

The housing inquiries, it may be noted, are not altogether new, as 
some of them have been carried on the population schedule since 1890, 
the results being published in the population volumes under the head- 
ing “Dwellings and Families.” The value-or-rent inquiry, which was 
incorporated in the 1930 population schedule, represents in particular 
an excursion into the field of housing statistics. 

It is probable that if provision is made at the next census for housing 
items as well as for the usual population items, the two will be combined 
on one schedule. This schedule may possibly be made up in the form of 
a separate sheet for each household, in place of the traditional multiline 
sheet with a line assigned to each person or each housing unit. Extended 
consideration was given to the use of a separate sheet for each unit in 
the present housing census, but it was finally decided that, since the 
two inquiries were to be handled by the same enumerator at the same 
time, the housing inquiry should be set up in a form parallel with that 
already established for the population inquiries. 

The separate household schedule has manifest advantages, especially 
in the handling of the enumeration. There are also very serious dif- 
ficulties attendant on its use, the most serious being that involved in 
making sure that all the separate sheets for a given area are properly 
assembled. It is also more difficult, or at least more expensive, to handle 
the separate sheets in the various processes of coding and tabulation. 
The advantages and disadvantages of this radical change in the form 
of the census schedule need, therefore, very extensive consideration 
before any change is actually made. It may be noted that a separate
household schedule is used in most of the European censuses, and that
the use of a separate sheet for the final schedule of a household which 
needs to be enumerated in one place and allocated to another some- 
what simplifies the process of allocation. 


SAMPLING PROCEDURE 


The sampling techniques embodied in the 1940 census program were 
adopted primarily as a means for handling additional questions beyond 
the number that could be carried on the schedule in the ordinary way. 
These questions are presented in a brief section at the bottom of each 
page of the schedule, containing two lines on which additional entries 
are made for a pre-selected two of the forty persons represented by the 
entries in the main part of the schedule. 

Once this five per cent sample was designated, however, it became 
evident that it would have further uses. Thus the Bureau had obtained 
from it in January, 1941, preliminary figures on sex, color, age, and 
work status, including employment and unemployment—figures which 
would not otherwise have been available for the United States before 
December. These were obtained by punching a brief special card from 
the entries on the main population schedule for the persons designated 
for sample inquiries. It will be possible, later, to make, on a sample 
basis, very detailed tabulations which the limitation of time and 
funds would make entirely impossible on the basis of handling the 
132,000,000 cards representing the entire population. These can be 
made either from the cards punched for the sample lines, which will 
contain most of the main schedule items in addition to the special 
questions, or from the sample-person cards, which are identified and 
can be sorted out from the main series of population cards for such 
special runs. Further details with regard to the sampling program are 
given in an article appearing on pages 369 to 375 of this JOURNAL.


EMPLOYMENT AND UNEMPLOYMENT 


The various new inquiries relating to work status, including employ- 
ment and unemployment, which, as already indicated, form one of the 
most important of the 1940 innovations, are likewise discussed in 
another article in this JOURNAL (pages 381 to 386).


MIGRATION 


The material for the so-called migration tabulations has been ob- 
tained through the use of a question asking, for each person five years 
old or over, the residence on April 1, 1935. These 1935 places of residence
are to be tabulated in combination with the place of enumeration,
that is, the place of residence in 1940, thus giving for the first time 
direct statistics of specific population movement. 

The first tabulation will be made by grouping the cards, on the basis 
of 1940 residence, into about 300 rather large areas, made up, substan- 
tially, by taking each city of 100,000 or more separately, and dividing 
the remainder of each state into three groups comprising urban, rural- 
nonfarm, and rural-farm areas, with a few additions representing 
metropolitan districts for the largest cities. The cards representing 
persons living in the same place in 1935 as in 1940, or in the same
county, will then be sorted out; and the remainder, representing the 
migrant population, will be sorted on the basis of 1935 residence into 
groups representing the same 300 areas. The cards will then be run 
through a tabulating machine on which will be set up color, sex, age, 
education, citizenship, work status, and broad occupational groupings, 
to provide data on the general characteristics of the migrant groups. 
From this tabulation can be obtained figures representing migration 
both in and out between any two of the 300 areas. There will have to be, 
of course, very considerable consolidation for publication purposes, but 
in order to get both in-migration and out-migration it is necessary to 
make the tabulation complete in all the detail required for any part 
of it. 
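
The logic of this first tabulation, a cross-classification of migrants by area of 1935 residence and area of 1940 residence, can be sketched in Python. The sketch is a simplification under stated assumptions: each person is represented only by a pair of area labels, and a person is treated as a non-migrant when the two labels agree, whereas the text sets aside persons living in the same place or the same county. The function name and the area labels are hypothetical.

```python
from collections import Counter


def migration_flows(records):
    """records: iterable of (area_1935, area_1940) pairs, one per person.
    Returns (flows, in_migration, out_migration) as Counters, with
    non-migrants set aside before counting."""
    flows = Counter()
    for origin, destination in records:
        if origin != destination:
            flows[(origin, destination)] += 1
    in_migration, out_migration = Counter(), Counter()
    for (origin, destination), n in flows.items():
        out_migration[origin] += n
        in_migration[destination] += n
    return flows, in_migration, out_migration


people = [("A", "B"), ("A", "B"), ("B", "A"), ("C", "A"), ("A", "A")]
flows, came_in, went_out = migration_flows(people)
print(flows[("A", "B")], came_in["A"], went_out["A"])  # 2 2 2
```

The sketch also shows why the full table is needed: the in-migration of one area is assembled from entries scattered across every area of origin, so no part of the table can be omitted if both directions of movement are wanted.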

It is planned to make, also, two other tabulations showing migration 
for smaller areas, including a series of subdivisions of states which have 
been worked up for this purpose. From this second tabulation figures 
will be obtained in the maximum geographic detail only for in-migra- 
tion by place of origin. In other words, the detail of the sort by place of 
residence in 1935 will be much greater than the detail for 1940 resi- 
dence. All the cards for migrants enumerated in the city of Cincinnati, 
for example, will be sorted into perhaps fifty or sixty packs representing 
the subregions of those states which are shown by the first migration 
tabulation to have contributed most to the population of Cincinnati. 

Tabulations of this kind will be rather expensive and time-consum- 
ing, and it is probable that not so many as would be desired can be 
completed within the census period. It is proposed also, if time and 
funds permit, to make a third series of tabulations involving small 
areas, in which the cards will be arranged for large areas on the basis of 
1935 residence, and sorted into small areas on the 1940 basis to give 
figures for out-migration corresponding to the detailed in-migration 
figures obtained from the second migration count. For this tabulation, 
with Cincinnati again taken as an example, all cards for migrant persons
who lived in Cincinnati in 1935 would be assembled from the
various areas of 1940 residence and sorted in accordance with the sub- 
regions in which they were found living in 1940. 

The principal source of migration figures in the existing census re- 
ports is the tabulation of the native population by state of birth, from 
which, by comparing the figures obtained in one census with those ob- 
tained in another earlier census, approximate net migration figures can 
be derived. The tabulation by state of birth is an expensive one, since 
it involves running through machines the entire 132,000,000 individual 
cards; and there has been some suggestion that it be discontinued, in 
view of the presumably much better migration figures to be obtained 
from the tabulations just outlined. Because the state of birth figures 
are available decennially since 1850, however, and because they have 
been extensively used in the past, there is a real argument against the 
omission of this tabulation for 1940. 


CENSUS TRACT REPORTS 


Census tracts are small areas which have been set up in a consider- 
able number of the larger cities for statistical and administrative pur- 
poses, through cooperation between the Bureau of the Census and a 
local committee in each city. The tracts have, for the most part, a 
population ranging from 4,000 to 8,000 and are to be maintained with a 
minimum of change from census to census, so that changes in the popu- 
lation or the characteristics of these small areas may be statistically 
recorded. Census tracts were first set up in New York City in connec- 
tion with the Census of 1910, and in 1930 figures were tabulated by 
census tracts for 18 cities. Even in 1930, however, the publication of the 
census tract figures was left to the local committees, which also paid a 
considerable part of the added cost of making the tabulations for such 
small areas. From the point of view of the widest possible usefulness of 
the tract tabulations, this arrangement was not altogether satisfactory 
since the figures were published and made generally available in only 
a few of the cities. Between 1930 and 1940 the list of cities in which 
census tracts had been laid out was increased to 61, including the 37 
cities which had 250,000 inhabitants or more in 1930. In order to make 
sure that the 1940 census tract tabulations are conveniently available 
to all possible users, the census tract tables are to be published by the 
Census Bureau in the form of a separate report for each city. This series 
of reports will form a part of the official census reports on population. 


METROPOLITAN DISTRICTS


Metropolitan districts are in a sense the converse of census tracts, 
being areas including both a central city (or cities) and certain adjacent 
territory with some urban characteristics and close connections with 
the central city. These areas are established primarily by the Bureau of 
the Census by the relatively simple process of including adjacent minor 
civil divisions (smaller cities, townships, etc.), so long as they have a 
population density of 150 per square mile, with some modification of 
the area thus mechanically established, based upon the recommenda- 
tions of local committees. Metropolitan districts were first established 
for use in connection with the Census of 1910; and statistics for 96 such 
districts, giving the total population and limited classifications by sex, 
color-nativity, and age, were published in a single volume for 1930. 
Between 1930 and 1940 additional metropolitan districts were estab- 
lished, bringing the total up to 140. As a result of this expansion all 
cities of 50,000 or more are included in some metropolitan district. It is 
planned to publish in the 1940 reports a very much larger amount of 
data for metropolitan districts than was published in 1930, and to in- 
clude this material with the other material for the state in which the 
district is located, rather than to present it only in a separate volume 
for metropolitan districts. The detail to be presented for the larger 
metropolitan districts will approximate that presented for states in the 
Second Series Bulletins for population and housing; and even for the 
smaller districts there will be much more material than in the 1930 
report. Maps of the districts, or maps showing the minor civil divisions 
of which the districts are made up, will be presented in connection with 
the statistical tables. 
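
The mechanical part of the delimitation described at the beginning of this section (adding adjacent minor civil divisions so long as they have a density of 150 per square mile) can be sketched in Python. Two assumptions are made that the text does not settle: that the rule means 150 or more, and that a division qualifies if it adjoins any division already included, so that the district grows outward step by step. The names and figures are invented, and the later adjustment on the advice of local committees is not represented.

```python
from collections import deque


def metropolitan_district(central, adjacency, density, threshold=150.0):
    """Grow a district outward from the central city, adding each adjacent
    minor civil division whose density (persons per square mile) meets the
    threshold. `adjacency` maps each division to its neighbours."""
    district = {central}
    queue = deque([central])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour not in district and density.get(neighbour, 0.0) >= threshold:
                district.add(neighbour)
                queue.append(neighbour)
    return district


adjacency = {
    "City": ["North", "East"],
    "North": ["City", "Far North"],
    "East": ["City", "Far East"],
    "Far North": ["North"],
    "Far East": ["East"],
}
density = {"North": 400.0, "East": 90.0, "Far North": 160.0, "Far East": 500.0}
print(sorted(metropolitan_district("City", adjacency, density)))
# ['City', 'Far North', 'North']
```

In the example the dense division beyond "East" is left out because it can be reached only through a division below the threshold.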


OTHER NEW FEATURES 


There now remains space for little more than a listing of the more 
important of the other new features embodied in the 1940 Census. 
Among other new questions on the schedule are: The question on wage 
income; the question on last grade of school completed, which takes the 
place, physically, of the traditional question on illiteracy but provides 
data on educational status which will be significant over a broad range, 
for every population group, in contrast with the two-way classification 
of the illiteracy tabulation (which so often had little value because less 
than 1 per cent of the group fell into the significant classification); and
a question designed to separate wage or salary workers employed by 
any governmental organization from those working in private industry. 
In this connection also might be mentioned the extension of the work 
status classification to cover the entire population 14 years old and
over, by providing appropriate classifications for those persons outside 
the labor force. 

There will also be a continuation of the expansion of the amount of 
detail published, both geographic detail and detail in classification,
which formed a considerable part of what was new in the 1930 census 
program. The most spectacular expansion of this kind is the proposed 
publication of selected items from the housing census by blocks for 
cities of 50,000 or more. The smallest area for which census figures have 
heretofore been published is the census tract, and the smallest area for 
which data have been furnished in special compilations has been the 
enumeration district. The city block figures are essential, however, for 
many purposes which the housing census is to serve, and arrangements 
are being made not only for publication of the block data, but also for 
the preparation of block maps, both base maps and analytic maps. 

The enumeration district tabulation, from which are obtained the 
data published by townships and small incorporated places, has been 
expanded so as to show age by sex, with somewhat more age detail than 
in 1930. The age series for counties and the smaller urban places has 
been expanded to include 5-year periods all the way up to 75 years. 
The tabulation of workers for counties and the smaller cities has been 
expanded to include not only industry, as in 1930, but also a brief 
occupation classification and the basic employment status classification 
which forms one of the principal innovations of the 1940 schedule. 
There are numerous other scattered additions, but the items listed may 
serve to illustrate this type of expansion in the proposed Census Reports.


MONOGRAPH PROGRAM


Plans are being made for a series of monographs or special studies 
somewhat like the series based on the 1920 Census, except that the list 
of titles under consideration is considerably longer and that plans are 
being made to have the monographs prepared either during the census 
period, which ends December 31, 1942, or at least immediately follow- 
ing the close of that period. It is hoped that the population monographs 
may be prepared for the most part by members of the present profes- 
sional staff, with advisory assistance from a number of outside experts 
who have been appointed as consultants. 

The list of monograph topics under consideration includes some that
are primarily historical, like one which might be entitled “One Hundred 
and Fifty Years of Population Growth.” This would incorporate some 
of the material of Mr. Rossiter’s Century of Population Growth
(1790-1900), which is now out of print, and would analyze the changes since
1900, perhaps with especial attention to the last decade, in which popu- 
lation trends have shown so many radical changes. A similar historical 
treatment of the data on families, from 1890 to 1940, might be con- 
sidered. The list includes also at least one critical monograph, which 
might be entitled “An Evaluation of the Returns of the 1940 Census 
of Population,” and in which all possible critical material, including the 
results of some tests of completeness of enumeration and accuracy of 
returns which are now in process in the Bureau, would be presented 
and discussed. Most of the candidates for inclusion in the list of mono- 
graph titles are, however, detailed analyses of some rather restricted 
field in the 1940 data themselves, such as “Unemployment in the 
United States in 1940”; “The National Labor Force of the United 
States in 1940”; “Internal Migration in the United States between 
1935 and 1940”; “The Characteristics of the Aged Population of the 
United States in 1940”; “Regional Differences in Housing in the United 
States: 1940”; “Plumbing, Heating, and Lighting Equipment of Ameri- 
can Homes: 1940”; or “Housing Vacancies in the United States in 
1940.” Finally, as an example of a monograph planned directly to aid 
in the use of the data presented in the regular Census Reports, or avail- 
able from unpublished sources, may be mentioned the topic, “The Use 
of Housing Census Data in the Analysis of Real Estate Conditions in 
a Community.” 

In general, while the monographs will present some tables of specially 
tabulated data, they will consist primarily of analytical and interpreta- 
tive comment, presented either from the point of view of general scien- 
tific interest or from the point of view of the user of census statistics, 
either in business or in governmental or educational organizations. 








THE USE OF SAMPLING IN THE CENSUS* 


By Philip M. Hauser, Assistant Chief Statistician for Population
Bureau of the Census 


RANKING WITH the most important of the innovations in the 1940
Census of Population is the introduction of sampling procedures
in the collection of information, in the tabulation of quick preliminary 
results, in the preparation of detailed cross-tabulations, and in the 
verification of large scale census clerical operations. 


COLLECTION OF INFORMATION 


As is the case in every census, the demand for new inquiries to be 
added to the 1940 population schedule far exceeded the limits pre- 
scribed by available time and money. After long weeks of conference 
and study resulting in the determination of the subjects to be investi- 
gated, it was clear that it was impossible to include for complete census 
enumeration a number of inquiries urgently needed by both govern- 
mental and private sources. It was also clear, however, that the desired 
statistics could be collected if sampling techniques were employed. 

Careful consideration was given to various alternative sampling pro- 
cedures. The suggestion to have several different sets of questions asked 
of different sectors of the population in rotation, and the proposal to 
sample on a representative area basis were quickly discarded because 
of the difficult administrative and technical problems involved. It was 
finally decided that the most feasible general approach, that suggested 
by Dr. Leon E. Truesdell, would lie in the selection of a small sample 
of the population to be asked additional questions in the course of the 
complete canvass. Some consideration was also given to the possibility 
of a follow-up enumeration of a sample so selected, but this alternative 
had to be dropped because of time and cost limitations. The actual 
technique devised for the selection of a representative sample during 
the course of the complete canvass, the manner in which bias arising 
from the use of a line schedule was minimized, and the type of schedule 
utilized which combined on one sheet of paper both the complete and 
the sample census inquiries, are described in some detail in a paper 
entitled “On the Sampling Procedure of the 1940 Population Census,” 
by Frederick F. Stephan, W. Edwards Deming, and Morris Hansen in 
the December, 1940, issue of this Journal. The development of the
techniques employed in the collection of the 1940 census sample data 


* A paper presented at the 102nd Annual Meeting of the American Statistical Association, Chicago,
December 28, 1940, brought up-to-date and coordinated with the other papers of this session. 
represents an interesting example of the application of mathematical
statistics to a problem beset with administrative and practical diffi- 
culties. 

It may be worth while to examine the type of census questions which 
best lent themselves to sample treatment. It is apparent that certain 
types of inquiries, such as those relating to the total number of persons 
in various small communities, and their classification by age, sex, and 
color, must necessarily be collected on the basis of a complete canvass. 
Such basic information is desired and needed for political and adminis- 
trative units that are too small for the application of sampling tech- 
niques and is, in fact, required by law. In general, when the census 
inquiry is needed for the purpose of an inventory, a complete canvass 
is the only solution. This means that inquiries relating to information 
deemed necessary in detail for small communities were not amenable 
to sample investigation. 

Fields of study, however, in which the primary objective is the ascer- 
taining of general relationships, or obtaining information needed only 
for large geographical areas, for studies of various economic and social 
relationships, and for recommending courses of action, are ideally 
suited for sample treatment. 

The inquiries included in the sample portion of the 1940 population 
schedule are of this character. They can be broadly grouped into three 
classes. 

In the first class are inquiries relating to subjects of greater impor- 
tance in the past than at present, which were included to provide sta- 
tistics permitting the continuation of time series. In this class are the 
inquiries relating to nativity of parents and mother tongue. With the 
virtual cessation of immigration during the past decade, and with the 
dwindling of the number of foreign-born persons in this country, these 
questions, in light of the demand for more pressing inquiries, could not 
be carried on the population schedule for the complete canvass. The 
collection of information relating to these items, on a sample basis, 
however, will provide data adequate for the continuation of the his- 
torical series derived from previous censuses. 

In the second class are the inquiries designed primarily to obtain 
information for administration or for the formulation of administrative 
or legislative policy. In this class are the inquiries relating to social 
security coverage and to veterans’ status. The Congress of the United 
States, the Social Security Board, the Veterans Administration, and 
many other agencies are vitally interested in these inquiries on the 
population schedule. The facts collected on a sample basis, which
otherwise could not have been obtained, will be of considerable importance
in determining legislative and administrative policy affecting the
welfare of millions of citizens and the expenditure of literally billions of dollars.

Finally, in the third class are inquiries of general economic and social 
significance, the primary objective of which may be termed scientific, 
that is, the determination of broad relationships and the general illumi- 
nation of problems to which they are addressed. Such inquiries, it may 
be added, will, in the long run, also have important administrative and 
legislative import. In this class are the census questions relating to 
fertility and usual occupation. The purpose of the fertility questions is 
not to obtain small area statistics but rather to ascertain and measure 
the important factors associated with fertility differentials. The purpose 
of the questions relating to usual occupation is to determine, in general, 
rather than in specific small localities, the extent to which persons are 
at work at “distress” occupations. There is no question but that these 
purposes will be served satisfactorily by the sample data collected. 

The introduction of sampling techniques in the collection of data, it 
is clear, has made it possible greatly to extend the scope of the census 
undertaking within the restricted allotment of time and money. The 
utilization of sampling procedures resulted in a tremendous increase in 
the amount of information collected for the use of Government and the 
public at large, with no appreciable loss in the accuracy of the data 
and with no increase in the cost of the census enterprise to taxpayers. 


QUICK PRELIMINARY RETURNS 


A second important use of sampling in the 1940 Census of Popula- 
tion lies in the preparation of a sample set of punch cards (the “S” 
Cards) containing selected information which had been collected on a 
complete canvass basis. These sample punch cards were designed to 
make available as early as possible a limited amount of information on 
subjects concerning which timeliness was of the essence. It is not 
usually realized that even with the most efficient of modern equipment, 
more than a full year must elapse before the approximately 132,000,000 
names on the population schedules can be processed, cards punched, 
and the first tabulations completed. The preliminary returns of employ- 
ment and unemployment and of the age, sex, and color composition of 
the population, which were released early in 1941, subjects of vital im- 
portance to many agencies of Government and to the public at large, 
were made available a full eight months in advance of their complete 
tabulation; and thus a vital public need for quick results was met 
through utilization of sampling techniques. 


CROSS-TABULATION


A third and, from a scientific standpoint, a very important use of 
sampling is to be found in plans for detailed cross-tabulations of data 
designed to reveal general relationships. It is financially impossible, 
with current appropriations, to cross-tabulate, for example, such items 
as occupation, by education, age, sex, color, and wage income for the 
entire labor force. Such a tabulation would involve, on a 100 per cent 
basis, the cross-sorting and tabulation of more than 50,000,000 cards. 
This tabulation, on a sample basis, can, however, produce adequate 
results for the determination of relationships for the country as a whole 
or for large geographic subdivisions. It is obvious that the elimination 
of one or two runs of as many as 50,000,000 cards provides sufficient 
funds for practically all of the runs necessary for rather thorough cross- 
tabulation of a sample of 2,500,000 cards. 
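
The arithmetic behind this comparison may be set out as a short sketch. It assumes, purely for illustration, that the cost of a machine run is proportional to the number of cards passed through the equipment; the card counts are those quoted above, and the function name is hypothetical.

```python
# Illustrative only: assumes the cost of a sorting or tabulating run is
# proportional to the number of cards handled (an assumption, not a Bureau figure).

FULL_CARDS = 50_000_000    # cards for the entire labor force, as quoted in the text
SAMPLE_CARDS = 2_500_000   # cards in the sample, as quoted in the text


def sample_runs_per_full_run(full_cards: int, sample_cards: int) -> float:
    """Number of sample runs that handle as many cards as one complete run."""
    return full_cards / sample_cards


if __name__ == "__main__":
    runs = sample_runs_per_full_run(FULL_CARDS, SAMPLE_CARDS)
    print(f"One complete run handles as many cards as {runs:.0f} sample runs.")
```

On that assumption, eliminating one or two complete runs releases machine time for roughly twenty to forty runs of the sample.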

In addition to the sample card, referred to above, to be used only for 
quick preliminary returns, present census plans provide for four other 
sample cards. The “Supplementary Individual Card” (the “B” Card) 
will be punched for approximately 6,600,000 individuals and will in- 
clude for these persons most of the information collected on a complete 
census basis as well as information collected on a sample basis. The 
“Fertility Card” (the “C” Card) will be punched for approximately 
two and one-half million women 15 years of age and over and will in- 
clude practically all of the pertinent information collected on a 100 per 
cent basis as well as the sample fertility information. The “Sample 
Family Card” (the “D” Card) will be punched for approximately 
1,750,000 heads of families who fall into the 5 per cent sample. This 
card will include information relating to the families of the heads 
collected on the complete census basis as well as certain sample informa- 
tion. A fourth card (Card “F”), devoted to bringing together housing 
and family information, was originally planned for complete coverage, 
but, because of limitation of funds, will be punched and tabulated on 
a sample basis. 

Finally, with respect to problems of tabulation, it should be men- 
tioned that additional opportunity for sample tabulation lies in the fact 
that all of the complete census individual cards, that is, the “A” Cards 
(of which there are approximately 132,000,000) have been so punched 
as to identify the persons in the 5 per cent sample. Thus, it will be 
possible to make sample tabulations of the more detailed complete 
census information on this punch card; and, in addition, and not with- 
out considerable importance, it will be possible to provide storage for 
the relatively small number of sample cards for a considerably longer
period than for all of the cards. 

The possibilities of cross-tabulation of these various sample cards are, 
of course, almost unlimited, and even with the relatively low costs in- 
volved, it will not be possible completely to exhaust during the census 
period the mine of information which they contain. The introduction 
into the census program of sample tabulations, however, will make 
available a considerable mass of data which otherwise could not pos- 
sibly have been produced. 


CLERICAL OPERATIONS 


A fourth and by no means unimportant use of sampling methods in 
the 1940 Census lies in their application to operation problems—prob- 
lems of quality and production control. The purpose of using sampling 
methods in the various processes required to convert the population 
schedule into a census volume is primarily to save time and money. 
The importance of such a saving is realized when it is pointed out that 
approximately twenty-two million dollars of the total appropriation of 
twenty-six million dollars for the population and housing censuses is 
required for the collection of the information and the preparation of 
punch cards. Only four million dollars, or about 15 per cent of the total 
appropriation, remains for tabulation and publication of the data. 
A relatively small proportionate saving in the manual work required up 
to the point of completing the punch card, it is clear, can result in large 
proportionate increases in the tabulation and publication program. 
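
The leverage described here can be checked with a few lines of arithmetic. The appropriation figures below are those given in the text; the 5 per cent saving rate is a hypothetical value chosen only to illustrate the point, and the function name is likewise an assumption.

```python
# Figures from the text, in millions of dollars.
TOTAL_APPROPRIATION = 26.0
COLLECTION_AND_PUNCHING = 22.0
TABULATION_AND_PUBLICATION = TOTAL_APPROPRIATION - COLLECTION_AND_PUNCHING


def tabulation_gain(saving_rate: float) -> float:
    """Proportionate increase in tabulation and publication funds if the
    fraction `saving_rate` of collection-and-punching cost is saved and
    redirected to tabulation and publication."""
    saved = COLLECTION_AND_PUNCHING * saving_rate
    return saved / TABULATION_AND_PUBLICATION


if __name__ == "__main__":
    share = TABULATION_AND_PUBLICATION / TOTAL_APPROPRIATION
    print(f"Share left for tabulation and publication: {share:.1%}")
    # Hypothetical saving rate, for illustration only.
    print(f"Gain from a 5 per cent saving: {tabulation_gain(0.05):.1%}")
```

Under these figures a saving of one-twentieth in the manual work would enlarge the tabulation and publication fund by more than one-fourth.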

Long years of experience have demonstrated that editing, coding,
punching, and other clerical operations involved in the treatment of 
mass schedule data must be verified if the resulting statistics are to 
attain the required degree of accuracy. The verification of these various 
clerical operations, however, is almost as costly as the original opera- 
tion. Sample techniques have been used, therefore, wherever possible in 
the verification of editing, coding, and punching operations. A record 
system was devised permitting the determination of the minimum 
amount of verification required to guarantee that the desired qualita- 
tive standards were maintained. A detailed treatment of the methods 
employed is given in the paper by Dr. W. Edwards Deming (pp. 351- 
360 of this issue). The material savings resulting from the application 
of sampling techniques to production problems, it is expected, will go 
a long way toward making possible a considerable increase in the tabu- 
lation and publication program of the population and housing censuses. 


CENSUS VS. SAMPLE DATA


At this juncture, it is perhaps fitting to deal briefly with two types 
of criticism frequently leveled at sample studies to the effect that (1) 
samples cannot be relied upon, and (2) that small numbers severely 
restrict the amount of detail which can be made available. 

The first type of criticism usually emanates from the layman rather 
than from the statistician, but, unfortunately, the term “layman” fre- 
quently includes members of the learned professions who have not been 
exposed to statistical education or experience. 

The fact is that a sample is quite adequate for ascertaining general 
relationships in the realm of economic and social phenomena and for 
recommending courses of action where administrative or legislative 
problems are involved. For such purposes a complete or 100 per cent 
canvass is frequently no better than a statistically adequate sample 
and may actually, under some circumstances, be much worse. Even a 
complete census provides only a sample of the characteristics of the 
population that could be obtained at different points of time, and re- 
sults in data which at any one time are determined by actions of chance 
as well as underlying social and economic causes. 

In the realm of social statistics, we make a cross-section statistical 
survey on a complete or a sample basis and assume that the relation- 
ships found continue to hold in the future. When a course of action is 
based upon a statistical study, the inferences drawn, whether they are 
based on complete or partial coverage, are subject to the common error 
which may arise from the fact that the social order is not static but 
dynamic. A sample coverage differs from a complete coverage only 
quantitatively and for most practical purposes the error which lies in 
predicting the future from the past, even if such predictions are based 
on a complete census, far exceeds the variance which results from 
sample coverage. In fact, it may be argued that the complete canvass 
may actually be worse than the sample as a basis for action, or for the 
prediction of the future, because usually a longer period of time must 
elapse before the results are at hand. To the extent that sampling 
methods make results available more quickly, the sample data may 
provide a much more nearly accurate basis for social action than the 
tardy returns of a complete census. This subject has been dealt with 
more fully in a paper by W. Edwards Deming and Frederick F. 
Stephan, entitled, “On the Interpretation of Censuses as Samples.”¹

Finally, it is frequently argued that small numbers may set severe 


1 This Journal, March 1941, pp. 45-49.
limits upon the amount of detail which can be made available through
sample cross-tabulation. It is in order, however, to note that limits will 
also eventually be reached in the tabulation of data based on a com- 
plete canvass. In general, information which cannot be obtained from 
samples as large as those represented by the samples to be used in the 
1940 Census of Population is the kind of information that should be 
secured through case study rather than through statistical methods. 
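
How quickly small numbers limit detail can be gauged from the textbook formula for the sampling error of a proportion. The sketch below assumes simple random sampling without replacement, which is a simplification and not the line-designation procedure actually used in 1940; the population and sampling fraction are the approximate figures quoted in this paper, and the function name and the cell sizes tried are hypothetical.

```python
import math

POPULATION = 132_000_000   # approximate 1940 population, as quoted in the text
SAMPLING_FRACTION = 0.05   # the 5 per cent sample


def relative_standard_error(cell_count: int,
                            population: int = POPULATION,
                            sampling_fraction: float = SAMPLING_FRACTION) -> float:
    """Approximate relative standard error of an estimated cell total.

    Assumes simple random sampling without replacement (a simplification).
    Uses Var(p_hat) ~= p(1 - p)/n * (1 - f), with n = f * population.
    """
    p = cell_count / population
    n = sampling_fraction * population
    variance = p * (1.0 - p) / n * (1.0 - sampling_fraction)
    return math.sqrt(variance) / p


if __name__ == "__main__":
    # Hypothetical cell sizes (true number of persons in a cross-classified cell).
    for cell in (1_000_000, 100_000, 10_000, 1_000):
        rse = relative_standard_error(cell)
        print(f"cell of {cell:>9,} persons: relative error about {rse:.1%}")
```

On these assumptions a cell of a million persons is estimated within a fraction of one per cent, while a cell of a thousand persons carries a relative error well above ten per cent, which is the sense in which the finest detail passes beyond what sample tabulation can supply.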


Sampling techniques have been utilized in a number of other ways in 
the performance of everyday tasks in the conduct of the census. Numer- 
ous small sample studies have been made from time to time which 
served as bases for decisions made in the design of the schedule, the 
preparation of instructions, the planning of tabulations and other mat- 
ters. The introduction and successful utilization of sampling techniques 
in the 1940 Census of Population will, in my judgment, stand out as a
tribute to the administrative foresight of William Lane Austin, Director 
(until his retirement at the end of January, 1941), and Dr. Vergil D. 
Reed, Assistant Director, of the Bureau of the Census; and to the tech- 
nical leadership and vision of Dr. Leon E. Truesdell, Chief Statistician 
for Population. 

It is to be hoped that the use of sampling techniques in the 1940
Census will serve as precedent for future censuses, and that new
extensions and uses of sampling methods will follow, to the benefit not
only of professional statisticians and social scientists but also of
governmental and private consumers of statistics and the
public at large.











GENERAL POPULATION STATISTICS* 


By Henry S. Shryock, Jr., Statistician, Population Division,
Bureau of the Census 


THE TERM “general population statistics” as here used covers the
number and distribution of inhabitants; such primary personal
characteristics as age, nativity, and education; fertility; and families. 
It will not be possible in the space allotted to summarize the kinds of 
tables on these subjects that are planned for various Census Bureau 
publications. Instead, an attempt will be made to highlight the impor- 
tant aspects of the vast number of tabulations that are scheduled for 
the next few years. Stress will be laid on new types of data, new cross- 
classifications, and extensions of data to new types of areas, particu- 
larly as they provide a possibility of answering demographic and social 
questions that are now largely subject to conjecture. 

First, however, reference should be made to the speed with which 
census data can be made available to the public. With an entirely un- 
precedented housing census, a much larger population schedule than 
ever before, and an ambitious publication program, the Bureau could 
be kept busy for years just in the processing of schedules and the 
preparation of tables and analytical studies. Obviously, many of the 
data would lose their timeliness and much of their significance if the 
public could not get a glimpse of them until they were in final, polished 
form. Accordingly, at every possible point the Bureau has planned to 
issue preliminary reports presenting the broad outlines of the subject 
matter. 

The determination of the number and geographic distribution of in- 
habitants is the primary concern of the Population Census, and these 
figures must be submitted to Congress by a fixed date. The total for 
each political unit is established by a hand count long before a card 
could be punched for each individual. The results of the hand count 
appeared in condensed form in state releases. They are now appearing 
in full detail in printed First Series State bulletins, which will finally 
be bound together to form Volume I of the Population Reports. 

In the past, figures on general population characteristics first became 
available in a second series of state bulletins and in the releases based 
upon them. Now, however, a Preliminary Sample Card, punched for 
every twentieth person, has been yielding statistics on the composition 
of our population by age, sex, race-nativity, and type of residence 


* A paper presented at the 102nd Annual Meeting of the American Statistical Association, Chicago,
December 28, 1940, brought up-to-date and coordinated with the other papers of this session.
(urban, rural-nonfarm, or rural-farm). The primary demographic
trends of the decade are already observable for the country as a whole 
and for each state. From these figures rough estimates of net interstate 
migratory movements and of net reproduction rates have been com- 
puted. 
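
The principle of a card punched for every twentieth person may be illustrated by a short sketch. It shows plain systematic selection from an ordered list and the inflation of a sample count to a population estimate; it is not the Bureau's designation procedure, which is described in the paper by Stephan, Deming, and Hansen cited in Dr. Hauser's article. The function names and the synthetic records are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def every_kth(records: Sequence[T], k: int = 20, start: int = 0) -> List[T]:
    """Systematic sample: take record `start`, then every k-th record after it."""
    if k <= 0 or not 0 <= start < k:
        raise ValueError("k must be positive and start must satisfy 0 <= start < k")
    return list(records[start::k])


def estimate_total(sample_count: int, k: int = 20) -> int:
    """Inflate a count observed in a 1-in-k sample to a population estimate."""
    return sample_count * k


if __name__ == "__main__":
    # Synthetic records, for illustration only: (person number, lives on a farm?)
    population = [(i, i % 4 == 0) for i in range(1_000)]
    sample = every_kth(population, k=20)
    farm_in_sample = sum(1 for _, on_farm in sample if on_farm)
    print(f"Sample size: {len(sample)}")
    print(f"Estimated farm residents: {estimate_total(farm_in_sample)}")
```

Because each sampled record stands for twenty persons, any count made on the sample cards is simply multiplied by twenty to give the preliminary estimate.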

In sum, at practically every stage of the tabulation program, special 
releases of the salient figures are being issued in advance of the detailed 
printed tables. Last of all will come the analytical monographs, the 
more involved cross-classifications of variables with an accompanying 
interpretative text on such broad subjects as differential fertility, edu- 
cational status, racial and ethnic groups, and metropolitan districts. 

As is mentioned in the article by Dr. Truesdell (pages 361-368 of this 
issue) census tracts and metropolitan districts have been given a promi- 
nent place in the tabulation program for the 1940 data. The value of 
permanent, small, homogeneous areas like census tracts has become in- 
creasingly apparent for urban social research, marketing analysis, and 
municipal planning. Accordingly, the number of inhabitants by tracts 
is being given in the first series of state bulletins and the composition 
and characteristics of tracts will appear in separate reports for each 
tracted area. The same characteristics will be tabulated for census 
tracts as for counties so that, for example, we shall have school attend- 
ance by age and the highest grade of school completed by persons 25 
years old or over, for each of four race-nativity groups. The metro- 
politan districts will figure in even more publications, including those 
on migration. 

Dr. Truesdell’s article also describes the added detail beyond that 
shown in 1930 that will be published for townships and other minor 
civil divisions, incorporated places of 1,000 to 2,500 inhabitants, and 
counties. Furthermore, one of the most important new items on the 
population schedule, the highest grade of school completed, will be 
made available for all persons 25 years old and over in counties and 
their rural-nonfarm and rural-farm parts. 

The subject of education is deserving of special mention. For the 
first time we have the highest grade of school completed by each mem- 
ber of the American population. This item is not only valuable as a 
measure of the formal educational attainments of our people, but it 
will be useful in many cross-classifications with other variables. Unlike
other indices of social-economic status, it is applicable to everyone. 
For example, the general fertility of women at different educational 
levels may now be examined. Formerly, the analysis of differential 
fertility by social class was restricted to wives of heads of households 
or to married women. Cross-tabulation of education with the new
question on internal migration will enable us to inspect on a nation-wide
scale the selections involved in the migratory process. 

Some of the items carried on the 1930 population schedule were ob- 
tained in 1940 only for the five per cent sample, for reasons of economy. 
(For the nature of this sample, see the article by Dr. Hauser on pages 
369-375 of this number.) A special card is being punched to provide 
for the tabulation of such items as birthplace of parents, mother 
tongue, and veteran status. Thus it will be possible to get very close 
estimates of the number of native persons of native parentage, of mixed 
parentage, and of foreign parentage. Such items may be cross-classified 
with age, sex, race, citizenship, and education, which are also on this 
card. 

The 1940 Census will give more attention than any previous one to 
the subject of human fertility. Demographers, who are keenly aware of 
our people’s declining reproductivity and of the social and economic 
implications of known fertility differentials, have been forced to make 
ingenious use of inadequate study materials. Now they will have a spe- 
cial 45-column card dealing solely with this subject. This card will be 
punched for about every twenty-fifth woman 15 years old or over. The 
two main measures of fertility on the card are the number of the 
woman’s children under age 5 in the household and the number of chil- 
dren she has ever borne. The first will be useful in approximating very 
nearly the net reproduction rate for women in a wide variety of groups. 
The second may be used to study the incidence of childlessness and the 
size of completed families. Another index is the number of the woman’s 
children aged 5 to 9 in the household. By comparison with the number 
of children under age 5, the fertility in the two halves of the last decade 
may be contrasted, and by adding the children under 5 and 5 to 9 it 
may be possible to obtain a measure roughly comparable with the 
number of children under age 10 that was tabulated for the East North 
Central States in 1930.¹

A considerable number of demographic, economic, and social
characteristics for the woman and for her husband, if she has one, will be
available for cross-tabulation. Included will be such factors as
equivalent monthly rental, wages of husband, wages of family, presence or
absence of nonwage income, weeks worked during the year by husband,
occupation of husband, work status of husband in the week preceding
the census date, work status and occupation of the woman in the week 

1 Notestein, Frank W., “Differential Fertility in the East North Central States: A Preliminary
Analysis of Unpublished Tabulations from the Family Cards of the 1930 Census,” Milbank Memorial
Fund Quarterly, 16 (2), pp. 173-191. April, 1938.
preceding the census date, education of woman, education of husband, 
migration of woman during the period 1935 to 1940, birthplace of 
woman, birthplace of parents of woman, mother tongue of woman, age 
of husband, and, of course, age of woman, marital status, age at mar- 
riage, and whether or not first marriage. By means of these last four 
items the duration of marriage may be controlled. It is thus apparent 
that gross and net reproduction rates for social-economic classes, as well 
as many other indices of fertility, may be derived. 

It will still be possible to compute ratios of children under 5 to women 
of childbearing age on a 100 per cent basis for geographic areas. These 
will be necessary for populations too small safely to be represented by a 
five per cent sample. Such ratios will also allow continuity with data 
from earlier censuses. Continuity is further provided for in the form of 
a special tabulation from a comparable sample of the 1910 schedules. 
Both the number of each woman’s children under age 5 and the number 
of children ever born may be obtained from these schedules.²
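
The ratio of young children to women described above can be illustrated with a short computation. The following Python sketch is only an illustration under stated assumptions: the function name, the expression of the ratio per 1,000 women, and the counts are invented here, and the age limits that define "childbearing age" are left to whoever supplies the counts, since the text does not state them.

```python
def child_woman_ratio(children_under_5, women_childbearing_age):
    """Children under 5 per 1,000 women of childbearing age.

    The age limits for "childbearing age" are an assumption of the caller;
    they are not specified in the article.
    """
    if women_childbearing_age <= 0:
        raise ValueError("need a positive count of women")
    return 1000 * children_under_5 / women_childbearing_age


# Invented 100 per cent counts for a hypothetical small county.
ratio = child_woman_ratio(children_under_5=420, women_childbearing_age=1050)
```

Because both counts come from the complete enumeration, such a ratio carries no sampling error, which is the reason it remains useful for areas too small for the sample.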

Finally, a comparison is being made of the Infant Card filled out by 
the enumerator for each child born between December 1, 1939, and 
April 1, 1940, and the birth certificates of this period. It should thus be 
possible to compute directly the amount of underenumeration of very 
young children and the amount of underregistration of births. Necessary
corrections to the various measures of fertility may then be made. 
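
The comparison of Infant Cards with birth certificates may be sketched as a matching of two lists. The Python example below is a simplified illustration, not the Bureau's procedure: it assumes that the two sources have already been matched so that the same infant carries the same identifier in both, and it takes as its base all infants found in either source, which ignores infants missed by both. The function name and the identifiers are invented.

```python
def coverage_gaps(enumerated_ids, registered_ids):
    """Compare infants on census Infant Cards with infants on birth certificates.

    Both arguments are collections of identifiers assumed to have been
    matched beforehand (an assumption of this sketch).
    """
    enumerated = set(enumerated_ids)
    registered = set(registered_ids)
    found_in_either = enumerated | registered
    if not found_in_either:
        raise ValueError("no records supplied")
    missed_by_census = registered - enumerated        # registered, not enumerated
    missed_by_registration = enumerated - registered  # enumerated, not registered
    total = len(found_in_either)
    return {
        "underenumeration": len(missed_by_census) / total,
        "underregistration": len(missed_by_registration) / total,
    }


# Invented identifiers for illustration only.
gaps = coverage_gaps(
    enumerated_ids=["a", "b", "c", "d"],
    registered_ids=["b", "c", "d", "e", "f"],
)
```

The two proportions could then serve as rough correction factors for the fertility measures, subject to the limitation noted above.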

The 1940 Census will provide statistics on both households (groups 
of persons sharing common living quarters and housekeeping arrange- 
ments) and on families (the persons related to the head of the house- 
hold). Whereas in 1930 there was essentially only one 24-column card 
for households, families, and fertility, there are now two 45-column 
cards for families and households alone. For about every twenty-fifth 
family and household, there will be two cards, one dealing with general 
characteristics and housing data and the other more with economic 
characteristics. It is also planned to secure a few tabulations for all 
households according to some simple characteristics of the head. These 
tabulations will be based on the cards for individual persons, where 
there is an indication of whether or not they are the head of a house- 
hold. 

The first of the two household cards mentioned just above carries 
such items as the tenure of the home, its value or rental, the number of 
persons per room, and other housing information; the number of re- 
lated persons in the household, the number of children under age 21 in 


2 It may be recalled that the number of children ever born was collected in the Censuses of 1890, 
1900, and 1910. This item was never tabulated except on a limited sample basis for 1910. 
the family, the number of related persons in the labor force, together 
with the number of these employed, on emergency work, or seeking 
work; the wage income of the family and whether or not the family had 
income from sources other than wages. In addition to these facts there 
are such characteristics of the head as sex, race, nativity, age, marital 
status, migration, employment status, occupation, and weeks worked 
during 1939. The presence or absence of a sub-family in the household 
is indicated, and special information is shown on hotels and institutions 
(quasi households). 

The second household card includes among other things the number 
of children under age 10 and under age 18, the number of children aged 
14 to 17 in the labor force, and the number of persons aged 65 or over; 
the number of wage earners and the wage income of the first and of the 
second earner in the family; the education and Social Security status 
of the head; and the age, employment status, and occupation of the 
wife. Many items are common to both cards, but rather different cross- 
classifications are possible. Tabulations from the two cards should yield 
figures of great value in marketing analysis, city planning, social work, 
and social research. 

Some months ago, three small summary releases were issued showing 
the 1930 distribution of families according to type. Such types were 
given as families with head and wife living together with and without 
children and families with a female head. Enough interest has been 
expressed in this classification of families to warrant its inclusion in the 
1940 family volume for the larger areas. 

From an enlarged population schedule and the use of supplementary 
questions for a sample of the population, the census has gathered in- 
formation on more demographic and social phenomena than ever be- 
fore. These facts are being punched on 5 main cards, each of 45 columns, 
as compared with 3 main cards in 1930, each of only 24 columns. A con- 
tinual flow of multilithed summary releases has already begun to bring 
the results to the consumer. These will be followed by published vol- 
umes presenting the statistics in more detail and, finally, by research 
monographs containing analysis and interpretation. 








EMPLOYMENT AND INCOME STATISTICS* 


By A. Ross Eckler, Chief, Employment and Income Statistics 
Population Division, Bureau of the Census 


THE ADDED EMPHASIS on economic problems and particularly those 
relating to the national labor force is one of the most significant 
aspects of the 16th Decennial Census of Population. Although some of 
the new questions have been included primarily for purposes of ad- 
ministrative research in particular Federal programs such as Social 
Security, the real occasion for most of them came from recognition of 
the general need for data bearing upon the problems of large-scale un- 
employment and underemployment and of insecurity resulting from 
irregular earnings. 


BASIS OF CLASSIFICATION OF LABOR FORCE 


The 1940 Census will provide for the first time a complete classifica- 
tion of all persons fourteen years of age and over, according to their 
activity within a single week (March 24-30, 1940). Persons in the labor 
force during that week will be subdivided into four classes: those at 
work on private or nonemergency government jobs; those on emer- 
gency work; those seeking work and without any form of employment; 
and those with a job from which they were temporarily absent because 
of vacation, temporary illness, weather conditions, short layoff, or in- 
dustrial dispute. Persons not in the labor force will be classified as 
engaged in own home housework, attending school, unable to work, in 
specified types of institutions, and outside the labor force for other 
reasons, such as possession of independent means or seasonal inactivity. 
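
The classification by activity during the census week can be expressed as a simple decision rule. The Python sketch below is illustrative only: the names are invented, the order in which the tests are applied is an assumption (the text does not say how a person reporting more than one activity is resolved), and the several classes outside the labor force are collapsed into one for brevity.

```python
from enum import Enum


class Status(Enum):
    AT_WORK = "at work on private or nonemergency government job"
    EMERGENCY_WORK = "on public emergency work"
    WITH_A_JOB = "with a job but temporarily absent"
    SEEKING_WORK = "seeking work, without any form of employment"
    NOT_IN_LABOR_FORCE = "not in labor force"


def classify(age, at_work=False, emergency_work=False,
             has_job=False, seeking_work=False):
    """Assign a census-week status; the precedence of the tests is assumed."""
    if age < 14:
        # Persons under 14 are classified as outside the labor force.
        return Status.NOT_IN_LABOR_FORCE
    if at_work:
        return Status.AT_WORK
    if emergency_work:
        return Status.EMERGENCY_WORK
    if has_job:
        return Status.WITH_A_JOB
    if seeking_work:
        return Status.SEEKING_WORK
    return Status.NOT_IN_LABOR_FORCE


example = classify(age=35, seeking_work=True)
```

Placing the "with a job" test ahead of "seeking work" reflects the condition that persons seeking work be without any form of employment; other orderings are conceivable.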

The basis for deciding whether a person was in the labor force differed 
in two important respects from that used in the 1930 and earlier cen- 
suses. In the first place, the classification depended upon actual activity 
(i.e., working or actively seeking work) which was believed to be a more 
definite and a more objective criterion than certain others, such as 
wanting to work, willingness to work, or ability to work. 

In the second place, the classification referred to a specific time, 
namely, the week of March 24-30, 1940. The choice of a definite time 
period was based upon both logical and practical considerations. Since 
the active labor force varies considerably in response to cyclical and 
seasonal changes, it cannot be defined except in terms of some specific 
period, such as a week, a year, or a number of years covering the various 
phases of a business cycle. The practical difficulties of applying the 


* A paper presented at the 102nd Annual Meeting of the American Statistical Association, Chicago, 
December 28, 1940, brought up-to-date and coordinated with the other papers of this session. 
concept increase greatly as the period is lengthened. It is relatively 
much simpler for an enumerator to make his classification on the basis 
of activity in a single week than on the basis of participation over a 
period of a year or longer. 

It is important that users of 1940 census data realize the difference 
between the new concept of the labor force determined by activity 
within the census week and the 1930 concept of gainful workers. This 
change will affect the size of the labor force and its characteristics, and 
hence must be taken into account whenever the 1940 labor force data 
are compared with data from previous censuses. The “gainful worker” 
group, to which many tabulations of 1930 and of earlier censuses ap- 
plied, included all persons who returned a gainful occupation, that is, 
an occupation by which they earned money or a money equivalent, or 
in which they assisted in the production of marketable goods. The 
gainful worker group thus included not only those actually at work on 
the census date, but also those who usually pursued gainful occupa- 
tions. It is possible that certain classes of persons, such as retired 
workers, some inmates of institutions, recently incapacitated workers, 
and seasonal workers neither working nor seeking work at the time of 
the census, were frequently included among gainful workers in 1930, 
but, in general, not in the 1940 labor force. On the other hand, new 
workers, a group numerically unimportant in previous censuses, but of 
considerable significance at the present time, were not included in gain- 
ful workers but are in the labor force of 1940. Furthermore, some per- 
sons with unknown occupation and industry, who could not have been 
counted as gainful workers in previous censuses, may be included in 
the 1940 labor force on the basis of evidence regarding their work 
status.

Differences in age limits slightly affect comparability between the 
total number of gainful workers in 1930 and the number in the labor 
force in 1940. All persons under 14 are classified as outside the labor 
force, but in the 1930 and previous censuses, persons in the 10 to 13 
year class were counted as gainful workers if they reported gainful 
occupations. The group of workers from 10 to 13 years of age has be- 
come so unimportant numerically as no longer to justify the additional 
burden of enumeration and tabulation necessary for the retention of 
the former 10-year age limit. 


EMPLOYMENT STATUS CLASSIFICATION 


It is generally recognized that no sharp line should be drawn between 
employment and unemployment, because gradations in hours of work 
and remuneration make certain types of employment almost equivalent 
to complete unemployment. In recognition of the importance of part- 
time employment or underemployment, two new questions on hours of 
work and wage income were included in the 1940 census schedule. In 
addition, information concerning the amount of employment during the 
preceding year is provided by a question on the number of weeks of 
work in 1939. 

The 1940 labor force tabulations will differ from the gainful worker 
tabulations of previous years in that most of the tabulations of labor 
force data will separate (1) persons in private or nonemergency govern- 
ment work or with a job, (2) persons on public emergency work, and 
(3) persons without any form of employment who were seeking work. 
The separation of these groups in occupation and industry tables is 
important because the present or latest occupation and industry have 
one meaning for a person at work or with a job, another meaning for 
one on emergency work, and still another for one seeking work. 

Employment status in the week of March 24-30, 1940, thus consti- 
tutes the primary basis for classifying the labor force, even though the 
total number of persons in each of the employment classes, as well as 
their distribution by hours of work or duration of unemployment or by 
occupation and industry, have probably changed considerably since 
March, 1940. The continuation of general patterns of relationship 
justifies detailed tabulations of the labor force data despite consider- 
able changes in the absolute figures. 

Large-scale unemployment has been a problem of such long duration 
that we should present all available information on the types of persons 
affected, the number and characteristics of their families, and the occu- 
pational and industrial groups in which part-time work or complete 
lack of employment is most frequent. The tabulations for emergency 
workers and persons seeking work will reveal significant relationships 
between the lack of employment and such characteristics as age, sex, 
color, occupation, and industry. The figures on duration of unemploy- 
ment and on weeks of work and amount of wages and salaries in 1939 
will provide further information about the problems arising from in- 
ability to find employment. Such data will be valuable even though 
unemployment continues to decline, and will be vitally important in 
any future period of large-scale unemployment. 

Although the tabulation plans are not final at the time of writing, 
probably the most extensive tabulations will pertain to persons em- 
ployed in private or nonemergency government work, who comprise 
over 85 per cent of the total labor force. The tabulations for states and 
large cities will show the interrelationships among age, sex, color, occu- 
pation, industry, class of worker, hours of work during the census week, 
and weeks of work in 1939. Such tabulations are expected to shed light 
upon the incidence of part-time employment. They will be valuable, 
moreover, in providing the necessary data for tying monthly employ- 
ment series to the census base. 

In addition to these general-purpose tabulations of the labor force, 
certain special tabulations will be made to clarify the composition of 
the various classes in the labor force and outside. Among the groups to 
which particular attention will be given are those “with a job,” persons 
not in the labor force for reasons other than home housework, school 
attendance, or disability, and persons for whom employment status was 
not reported. In addition, tabulations will be planned wherever possible 
to show the nature of the misclassification of public emergency workers 
revealed by a preliminary tabulation of the five per cent sample. 

Considerable emphasis will be placed upon the class-of-worker group- 
ing because most of the measures of employment and income are much 
more significant for wage and salary workers than for employers, own- 
account workers, and unpaid family workers. Moreover, certain govern- 
ment programs, such as unemployment compensation and old-age in- 
surance, apply only to wage workers. 


INCOME DATA 


The questions on income were probably more widely publicized than 
any others on the schedule. These questions were designed to ascertain 
the amount of cash wages and salaries received during 1939 by each in- 
dividual 14 years of age and over, and whether $50 or more was re- 
ceived in nonwage income. The data will be valuable in estimating the 
total national income and its distribution, figures which are becoming 
increasingly important in the measurement of national well-being. 
Analysis of the wage and salary data with reference to the age, sex, 
color, weeks worked, and occupation of the recipients will provide ex- 
tensive information on income differentials and opportunities for earn- 
ings. The data will also be of great value to market analysts, who have 
heretofore been forced to use a variety of incomplete and indirect 
measures to estimate the relative purchasing power of various parts of 
the population. 

The wage and salary data, taken in conjunction with figures on 
rentals and values of homes, may make it possible to estimate the dis- 
tribution of nonwage income for those reporting such income. Such 
estimates may provide the basis for estimates of total income to supple- 
ment figures available from the federal income tax returns. 

The additional questions asked of the five per cent sample of the 
population, which is described on pages 369 to 375 of this issue of the 
JOURNAL, include an inquiry regarding the possession of a Social 
Security number, and the proportion of salary or wages from which 
deductions were made for Old-Age and Survivorship Insurance or Rail- 
road Retirement. Such information will considerably enhance the use- 
fulness of the census data on wage and salary income, since it provides 
a means for tying together two major sources of economic materials, 
the decennial census with its complete coverage of the labor force and 
the current wage account records of the Social Security Board. 


EMPLOYMENT AND INCOME DATA FOR FAMILIES 


The funds available for the transcription and tabulation of family 
data have been seriously reduced by unavoidable increases in the cost 
of certain preceding portions of the census program. Consequently, it 
has been necessary to plan this part of the program on a sample basis. 
The precise types of tabulations that are feasible and the areas for 
which they can be shown will be dependent upon the size of the sample. 
In any case, however, the employment and income data for families 
will provide important information on the problems arising from unem- 
ployment or inadequate employment. Since the family is the primary 
consuming unit, the employment status of the workers in the family 
and their combined wage income are significant in analyzing the im- 
pact of unemployment upon family welfare and buying power. It is 
generally recognized that the seriousness of the loss of a job depends 
upon the presence or absence of other family members who may con- 
tribute to the family’s support. 

It is hoped that families may be classified on the basis of the number 
of members in the labor force; the employment status of the family 
head and wife, as well as of secondary workers, including minor chil- 
dren; duration of unemployment in families with no member at work; 
and total wage and salary income. Unless the sample is extremely small, 
the family data on income will be useful to marketing agencies, since 
the family is the buying unit for the majority of consumer goods. 
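
The family classification contemplated above amounts to combining the records of individuals into one record per family. A minimal Python sketch of that step follows; the record layout (`family_id`, `in_labor_force`, `wage_income`) and the figures are hypothetical and are not drawn from the census cards.

```python
from collections import defaultdict


def summarize_families(persons):
    """Count labor force members and total wage income for each family."""
    summary = defaultdict(lambda: {"in_labor_force": 0, "wage_income": 0})
    for person in persons:
        family = summary[person["family_id"]]
        if person["in_labor_force"]:
            family["in_labor_force"] += 1
        family["wage_income"] += person["wage_income"]
    return dict(summary)


# Invented person records for two hypothetical families.
persons = [
    {"family_id": "A", "in_labor_force": True, "wage_income": 1200},
    {"family_id": "A", "in_labor_force": True, "wage_income": 300},
    {"family_id": "A", "in_labor_force": False, "wage_income": 0},
    {"family_id": "B", "in_labor_force": False, "wage_income": 0},
]
families = summarize_families(persons)
```

Families could then be grouped by the number of members in the labor force or by class intervals of combined wage income, as the tabulation plans permit.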


TABULATION PROGRAM: AREAS AND COMPLETION DATES 


The majority of the labor force tabulations will be presented for 
large areas, that is, for states and large cities. Certain tabulations will 
be based upon the complete population and others, for which less geo- 
graphic detail is necessary, will be based upon the five per cent sample. 
The various large-area tabulations of data for individuals will begin to 
appear in the autumn of 1941 and should be completed early in 1942. 

Labor force data are included in three other parts of the census tabu- 
lation program. Information on employment status by sex, color, age, 
and residence was obtained from a preliminary tabulation of the five 
per cent sample and issued in a series of releases beginning in January 
of this year. Data on the employment status of persons 14 years old 
and over and on the distribution of the employed labor force by broad 
occupation and industry groups will be presented for small areas 
(counties and incorporated places of 10,000 and over, with less detail 
for parts of counties and smaller incorporated places) in the Second 
Series Population Bulletins, to be issued state by state during the sec- 
ond half of 1941. Tabulations of labor-force data for families will not 
begin to appear before the first quarter of 1942. 

One feature of the 1940 tabulation program, the increased relative 
emphasis upon metropolitan districts, is of particular importance in 
connection with the labor force data. For most labor market summaries 
the metropolitan district is a more significant unit of analysis than the 
central city. In fact, the characteristics of the labor force residing in 
the central city may differ so substantially from those of workers in the 
whole district that a tabulation showing relationships for the central 
city alone may provide a misleading picture of the complete labor 
market. 




















OCCUPATION AND INDUSTRY STATISTICS* 


By Alba M. Edwards, Chief, Occupation and Industry Statistics 
Population Division, Bureau of the Census 


CENSUS OCCUPATION statistics, as distinguished from industry statistics,
were first published in 1850. The 1940 statistics will thus 
round out a complete century of census occupation statistics. One 
might suppose, therefore, that the 1940 Census would have no impor- 
tant new features. But it will have several. 

The most outstanding new features of the 1940 occupation census are: 

1. The basic change from the 1930 concept of “gainful worker” to 
the 1940 concept of “labor force.” 

2. The new occupation classification. 

3. The new industry classification. 

4. The publication of class of worker statistics, and of statistics re- 
lating to new workers. 

5. The proposed new tabulations. 


The change from the 1930 concept of “gainful worker” to the 1940 
concept of “labor force” is discussed on page 382 of this JOURNAL. 


THE NEW OCCUPATION CLASSIFICATION 


Users of the occupation statistics published by the different Federal 
agencies have long felt the need for standardization in the classification. 
The lack of such standardization in the past has often made it impos- 
sible to compare the occupation statistics published by the different 
agencies. In 1938, the American Statistical Association and the Central 
Statistical Board appointed a Joint Committee on Occupational Classi- 
fication to devise a standard classification. This Committee was com- 
posed of representatives of a number of government agencies, and 
representatives of the American Statistical Association and the Central 
Statistical Board. The Committee and its Technical Subcommittee 
formulated, during 1938 and 1939, a Standard Convertibility List of 
Occupations.¹ The Standard Convertibility List, as the title implies, is 
not, primarily, a classification scheme. It represents, rather, a meeting 
ground on which differing classification schemes can be reconciled, a 
basis on which the occupation statistics of the different agencies can be 
compared. 

* A paper presented at the 102nd Annual Meeting of the American Statistical Association, Chicago, 
December 28, 1940, brought up-to-date and coordinated with the other papers of this session. 
1 Presented in this JOURNAL, December, 1939, Vol. 34, pp. 693-707, by Gladys L. Palmer. 

The 1940 census occupation classification conforms in large measure 
to the Standard Convertibility List. The arrangement of the census 
classification differs somewhat, however, from the arrangement of the 
Convertibility List, and a considerable number of the composite occu- 
pation groups of that List have been subdivided in the census classi- 
fication. Through such subdivisions, the 327 occupations and occupa- 
tion groups of the Convertibility List have been increased to 451 in the 
census classification. The increase has consisted, principally, in the 
further subdividing, by industry, of “Proprietors, managers, and offi- 
cials”; “Foremen”; “Inspectors”; “Operatives and kindred workers”; 
and “Laborers, except farm.” The subdivisions of two groups of the 
Standard Convertibility List, namely, “Farmers” and “Porters,” are 
not included in the census classification. 

In the 1940 census classification, most of the occupations are “re- 
peaters,” that is, occupations which are repeated in a number of in- 
dustries or in all industries. This makes the classification more nearly 
accurate than was the 1930 classification, and it makes possible a more 
nearly accurate classification of occupations by industry. 




MAJOR OCCUPATION GROUPS 


In reporting the occupations of the gainful workers of the United 
States, the Bureau of the Census has customarily grouped the occupa- 
tions into a few large industrial divisions, as Agriculture, Manufactures, 
etc., and has classified each occupation in that industrial division in 
which the occupation is most commonly pursued. No attempt has been 
made by the Bureau of the Census, in its decennial Census Reports, to 
group the gainful workers according to large occupation groups, or ac- 
cording to social-economic groups or strata. Yet, there is a real need for 
such an additional grouping. In many present day studies, we wish to 
deal with large social-economic groups, such as professional workers, 
clerical workers, skilled craftsmen, etc., with but minor regard to the 
particular occupations pursued by the workers, or to the section of the 
industrial field in which they are employed. More and more statisti- 
cians have come to realize the significance of statistics which distribute 
the labor force into large social-economic groups. In response to this 
growing recognition, the occupations of the Standard Convertibility 
List were arranged into nine major occupation groups. In the 1940 
census classification, the number of major occupation groups was in- 
creased to eleven by changing “Farmers and farm managers” and 
“Farm laborers and foremen” from subgroups to major groups. The 
census groups are as follows: 

1. Professional and semi-professional workers. 
   a. Professional workers. 
   b. Semi-professional workers. 
2. Farmers and farm managers. 
3. Proprietors, managers, and officials, except farm. 
4. Clerical, sales, and kindred workers. 
   a. Clerical and kindred workers. 
   b. Salesmen and saleswomen. 
5. Craftsmen, foremen, and kindred workers. 
6. Operatives and kindred workers. 
7. Domestic service workers. 
8. Protective service workers. 
9. Service workers, except domestic and protective. 
10. Farm laborers and foremen. 
11. Laborers, except farm. 


The substitution of major occupation groups for the industrial divi- 
sions which hitherto have formed the framework of our occupation 
classification is a real scientific advance. We shall still have large in- 
dustrial divisions, but they will form the framework of our industry 
classification only, and not the framework of our occupation classi- 
fication. 


THE NEW INDUSTRY CLASSIFICATION 


The 1940 Census Industry Classification,² for use in classifying 
workers industrially, is based on the “Standard Industrial Classification,”³
which was prepared, during 1937, 1938, and 1939, under the 
auspices of the Central Statistical Board, by a committee composed of 
representatives of various government agencies. The Standard Indus- 
trial Classification was devised for the purpose of classifying industries 
on the basis of returns from establishments. The modification of the 
Standard Industrial Classification, being used by the Census in con- 
nection with the 1940 occupation classification, was made by the same 
interdepartmental committee that prepared the Standard Converti- 
bility List of Occupations, previously referred to. The modification 
consisted in combining the 1,411 industries of the Standard Industrial 
Classification into 132 industries and industrial groups, in order to form 
a classification suitable for coding industrial information obtained from 
individual workers or members of their families. Since, with a few ex- 
ceptions, the combinations made were of consecutive titles, the Standard

2 Presented in this JOURNAL, March, 1940, Vol. 35, pp. 74-85, by Bruno Fels and P. K. Whelpton. 
3 Presented in this JOURNAL, March, 1940, Vol. 35, pp. 65-73, by Vladimir S. Kolesnikoff. 
Industrial Classification is readily convertible to the industry 
classification formulated for use in connection with the Standard Con- 
vertibility List. This is the classification that is being used in connec- 
tion with the 1940 census occupation classification. 

The new occupation classification and the new industry classification 
were adopted by the census despite rather wide differences between 
these classifications and those followed at the 1930 Census. It is hoped 
that the great value of having standard classifications will compensate 
in large measure for the inevitable loss of complete comparability with 
the 1930 census statistics. Statisticians will welcome the fact that at 
this census, for the first time, the Census Bureau will use the same in- 
dustrial classification (in detail or in condensed or modified form) in its 
statistics of occupations, agriculture, manufactures, mines, and busi- 
ness; and they will also welcome the fact that this same industrial 
classification will be used by most of the other government agencies 
that publish industry statistics. 

The 1940 Census occupation and industry classifications are not 
ideal. They are neither technically exact nor scientifically accurate; but 
it is believed that they approach about as nearly to ideal classifications 
as is now practicable in the case of general-purpose classifications to be 
adopted and used, both by agencies, like the Census, that collect data 
through a house-to-house canvass by enumerators who usually do not 
see the workers enumerated, and also by agencies that collect data 
through personal interview with workers or with their employers. 

These new classifications represent a long stride toward uniformity 
in our statistics, but we should not overlook the fact that to have real 
uniformity we must have not only uniformity of classification but also 
uniformity of enumeration. Even with a standard classification, we 
cannot have a high degree of uniformity between statistics compiled 
from data collected in a house-to-house canvass and statistics compiled 
from data collected by direct personal interview with workers or with 
their employers. 

CLASS OF WORKER 


The phrase, “class of worker,” is here used to designate the classi- 
fication of workers according to position in industry, as: 


Wage or salary worker in private work. 
Wage or salary worker in government work. 
Employer. 
Persons working on own account. 
Unpaid family worker. 

The Bureau of the Census collected information relating to the class 
of worker in 1910, in 1920, and in 1930. In 1910 and in 1930 the Bureau 
coded and tabulated the returns, but at each census an examination of 
the tabulated figures indicated that they were not sufficiently accurate 
to justify their publication. 

It was evident that class of worker statistics, especially those relating 
to wage and salary workers, would be of increased importance at the 
1940 Census, particularly in the analysis of the census unemployment 
and income data, and the data relating to the wage and salary workers 
covered by the Social Security law. Therefore, the Census Bureau again 
decided to collect data relating to class of worker, and it made a special 
effort, in formulating the schedule and in framing the inquiries and the 
instructions to enumerators, to secure more reliable information than 
had been secured at any of the three preceding censuses. We have coded 
the returns, and it is our belief that these returns are sufficiently ac- 
curate to justify their tabulation and the presentation of the resulting 
statistics. We, therefore, plan to include in our 1940 occupation reports 
tables showing workers in the different occupations and industries, 
classified by class of worker, correlated with classification by sex, color 
or race, and age. 

At the 1940 Census, as at each of the three preceding decennial cen- 
suses, the census enumerators, in their returns, frequently did not dis- 
tinguish carefully between employers and persons working on their own 
account. Because of this deficiency in the returns, the Census will 
tabulate as one group the returns for employers and those for persons 
working on their own account. This will not be as great a loss as it may 
seem to be, for the line of demarcation between employers and persons 
working on their own account frequently is not clear-cut and distinct. 
Indeed, a worker may be, and sometimes is, an employer one week, a 
worker on his own account the next week, and a wage worker the third 
week. This is particularly true of such workers as carpenters in building 
construction work. 


NEW WORKERS 


Long years of depression, rapid technological changes, and rapid 
mechanization of industry, together with other causes, have resulted in 
new workers in the United States having an importance at the present 
time, numerically and otherwise, that they have not had at any pre- 
ceding Federal census. New workers, although without regular work 
experience, are a part of the Nation’s labor force. They are potential 
workers, and, hence, statistics relating to them are fully as important 
as, if not, indeed, more important than, statistics relating to other mem- 
bers of the labor force. Therefore, as a part of the 1940 Census, the 
Census Bureau collected information in regard to new workers. This 
information, classified by sex, age, color or race, and possibly corre- 
lated with information as to the highest grade of school completed, will 
be published in the census reports on occupation statistics. 


NEW TABULATIONS 


Tabulation plans for the 1940 occupation census are still incomplete. 
They will include the usual tabulation of occupations by sex, age, color 
or race, marital condition of females, and industry; and it is the purpose 
of the Census Bureau, if funds and time permit, to expand the tabula- 
tions to include additional areas and additional data. Tentative plans 
call for some tabulations for metropolitan districts; for some tabula- 
tions for the urban places of from 2,500 to 10,000 population; and for 
some tabulations for the towns of from 10,000 to 25,000 population. 
The 1930 census tabulation of workers, by sex and industry groups, 
published for counties and for cities of 25,000 or more, will be expanded 
to include major occupation groups, as well as industry groups; and 
tentative plans call for the tabulation and publication of these statistics 
for states, for counties, for cities of 10,000 or more, and for metropolitan 
districts.4 In a special table the statistics for each state will be presented 
for urban, rural-farm, and rural-nonfarm areas. The data relating to 
the employed, the emergency workers, and those seeking work, re- 
spectively, will be tabulated separately; and data relating to hours 
worked, weeks worked, duration of unemployment, and income may be 
tabulated by occupation and industry. And the Bureau hopes that it 
may be possible to crowd into its rather full program tabulation of data 
relating to marital status of male workers; present occupation of 
workers, classified by usual occupation; school attendance of workers 14 
to 24 years old; and highest grade of school completed by workers. 
Some of these tabulations, if made, will be on a 5 per cent sample basis. 

4 For urban places of 2,500 to 10,000 population, and for the rural-farm and rural-nonfarm parts of 
counties, the same tabulation will be made, but the statistics will be published by major occupation 
group only. 








THE HOUSING CENSUS OF 1940* 


By Howard G. Brunsman, Chief, Housing Statistics 
Population Division, Bureau of the Census 

THE HOUSING CENSUS of 1940, the first Nation-wide inventory of 
housing, was conducted as a part of the Decennial Census of that 
year and undertaken in response to the increasing public recognition of 
the importance of this field. This recognition is indicated by the efforts 
of the Federal Government to improve housing conditions through fed- 
eral agencies such as the Federal Home Loan Bank Board, the Home 
Owners’ Loan Corporation, and the Federal Housing Administration, 
all of which operate to lower the financial costs of home ownership; 
through the United States Housing Authority operating in the field of 
public housing for families of low income; and through the Farm 
Security Administration and Farm Credit Administration operating in 
the field of rural housing and farm finance. The Census of Housing was 
designed to provide essential housing facts to guide these and other 
government agencies interested in housing, as well as the local public 
groups and private concerns operating in the fields of real estate, mort- 
gage finance, housing, and the manufacture and distribution of house- 
hold equipment. 

Two different schedules were used in the process of enumeration. The 
primary one, the Occupied-Dwelling Schedule, was employed for the 
enumeration of all dwelling units that were occupied by households 
enumerated on the regular Population Schedule. It should be noted 
that the Census of Housing was conducted on the basis of complete 
coverage of the 34,750,000 occupied dwelling units in the United States. 

In addition to this primary schedule, the Vacant-Dwelling Schedule 
was employed to obtain data for all dwelling units not occupied by 
enumerated households. This group represents primarily the dwelling 
units that were vacant and for sale or rent at the time of the enumera- 
tion. In addition, the enumeration on the Vacant-Dwelling Schedule 
included the vacant dwelling units that were not on the market for 
sale or rent because they were being held for the occupancy of specific 
absent households. In this category were summer cottages and other 
dwelling units held for occasional occupancy. Dwelling units occupied 
by households who reported that their homes were located elsewhere 
also were enumerated on the Vacant-Dwelling Schedule. 

The various items covered by the Census of Housing of 1940 may be 
divided into six principal groups: 

* A paper presented at the 102nd Annual Meeting of the American Statistical Association, Chicago, 
December 28, 1940, brought up-to-date and coordinated with the other papers of this session. 

1. Characteristics of residential structures. For each residential struc- 
ture, whether it contained 1 or 1,000 dwelling units, the following items 
were obtained: type of structure, conversion of structure, exterior ma- 
terial, need of major repairs, and year built. 

2. Occupancy status, tenure, and population characteristics. Dwelling 
units were classified as: occupied by enumerated households; vacant, 
for sale or rent; vacant, held for absent households; and occupied by 
nonresident households. Dwelling units occupied by enumerated house- 
holds were classified as owner-occupied and tenant-occupied, while 
other dwelling units were classified as ordinary or seasonal. Data re- 
garding color or race of head of household and total number of persons 
in the household were transcribed from the Population Schedule. 

3. Equipment and facilities of the dwelling unit. The items in this cate- 
gory include: number of rooms; water supply; toilet facilities; bathing, 
lighting, refrigeration, and heating equipment; heating and cooking 
fuel; and radio. 

4. Monthly rental or value of home. Data on value of home were ob- 
tained for each owner-occupied dwelling unit. Data regarding contract 
rent were obtained for each tenant-occupied dwelling unit; while esti- 
mated rent was obtained for each vacant unit. In addition, estimated 
rent was obtained for each owner-occupied nonfarm dwelling unit. 
Thus a rent figure, either contract or estimated, is available for each 
nonfarm dwelling unit. 

5. Furniture and utilities. Large variations are known to exist in the 
relationship between contract rental and size, age, or equipment of 
tenant-occupied homes. These variations are due in part to the facili- 
ties and furniture which are included in the contract rental of some, but 
not all, dwelling units. In order to adjust for this factor, information has 
been obtained on the monthly cost of utilities and fuel paid for by ten- 
ants in addition to contract rental. Data also have been obtained on 
the estimated rental without furniture of units for which furniture was 
included in the rental. By adding these utility costs to the contract or 
estimated rental without furniture a total is secured which is best 
suited for comparisons among the various types of dwelling units. This 
figure, which is designated as “gross rental,” will be presented for 
each tenant-occupied nonfarm dwelling unit. 

6. Home finance. The data obtained for each 1-4 family owner- 
occupied nonfarm property without business include the value of 
property, mortgage status, and amount of indebtedness. In addition, 
the following characteristics of the outstanding first mortgage were 
obtained: interest rate, type of mortgage holder, frequency and amount 
of regular payments, and whether or not these payments include re- 
duction of principal or real estate taxes. These data were not obtained 
for farm units, since a farm represents a dwelling place in combination 
with a business enterprise. Neither were they obtained for residential 
properties that were entirely tenant-occupied, because of the difficulty 
of locating and obtaining data from the owners of such properties and 
the fact that the tenants, as a general rule, are unable to supply the 
required information. 


PRELIMINARY REPORTS 


A series of housing reports presenting revised figures on the number 
of occupied and vacant dwelling units has already been issued for each 
state and for the United States. In these revised reports, data were pre- 
sented for each county, for each urban place with a population of 2,500 
or more, and for wards in cities of 100,000 or more. 

In addition to the revised reports issued on a state basis, a separate 
report for each of the 140 metropolitan districts presents the revised 
housing figures and the final population figures for the metropolitan 
district as a whole and for each minor civil division within the district. 
Another series of reports presents the population and housing data by 
census tracts for each of the 61 cities for which tracts have been estab- 
lished. Most of the reports in these two series, including a summary of 
the metropolitan districts, have already been released, and it is an- 
ticipated that the last of the two series will have been completed by the 
time this article appears in print. 

Also released recently were a number of special reports presenting 
revised figures on the average size of family in the United States, 
population and housing figures for urban places of 10,000 inhabitants 
or more, and a map of urban vacancy in the United States by counties. 
In process and scheduled for release in the immediate future are reports 
on vacancy figures and average size of family by size of city. Several 
other releases of this analytical nature are in preparation. 


FINAL REPORTS AND APPROXIMATE PUBLICATION DATES 


All of the above preliminary reports are based on “hand” counts of 
the dwelling units reported on the schedules. The more detailed 
statistics from the Housing Census will be obtained from the machine 
tabulations of three 45-column punch cards. The data on these cards 
are indicated by the description of the statistics to be issued in each 
series of final reports presented below. It is planned that brief advance 
reports, containing some of the more important information to appear 
later in the final printed reports, will be released as soon as the data 
are tabulated and verified, thus advancing the release date by the time 
required for printing. 

First Series Housing Bulletins. A housing bulletin of this series will 
be issued for each state, bulletins for the less populous states being 
released first. One of the principal tables in this series presents for each 
county, by minor civil divisions and incorporated places of 1,000 in- 
habitants or more, the following characteristics of housing: 


Urban, rural-nonfarm, and rural-farm combined: 

Distribution of all dwelling units into: owner-occupied; tenant- 
occupied; vacant, for sale or rent; vacant, not for sale or rent. 

Number of units occupied by nonwhite households. 

Number of occupied units having 1.51 persons per room or more. 

Number of (occupied and vacant) units needing major repairs; 
number having no private bath; number either needing major 
repairs or having no private bath. 


Urban and rural-nonfarm combined: 

Number of structures and number of dwelling units. 

Number of owner-occupied dwelling units mortgaged. 

Average monthly contract or estimated rent (includes data for 
owner-occupied, tenant-occupied, and vacant units). 


Rural-farm: 
Number of dwelling units. 


Practically all of the above itemized data are also presented for 
each urban place, by wards for cities of 10,000 inhabitants or more, and 
minor civil divisions for metropolitan districts. The rural-farm housing 
data, shown for each county by townships or other minor civil divisions, 
are similar to the above itemized nonfarm data except that mortgage 
status and average rental have been replaced by more extensive detail 
on plumbing and lighting equipment. 

In addition to the state bulletins, the First Series Housing Bulletins 
will include supplements for each of the 191 cities which had 50,000 
inhabitants or more in 1930. The statistics by blocks to be presented 
in these supplements will be the first ever published by the Census for 
so small a unit as the city block. The data for each block will contain 
all the itemized nonfarm data indicated above and in addition a dis- 
tribution of dwelling units by year built. Identification maps, prepared 
by WPA project workers, show the location of specific blocks, and will 
accompany the First Series Supplements for the 191 cities. 

Experience gained in the local Real Property Surveys that have been 
conducted in several hundred communities indicates that these block 
data will prove of considerable value. With these data available, it is 
possible to construct maps showing the pattern of rentals, of overcrowd- 
ing, and of the various other factors throughout the city. The pattern 
is shown in much greater detail on this basis than is possible with data 
available for enumeration districts, census tracts, wards, or other 
larger areas. The block data also should prove useful in various market- 
ing activities. They will serve as control figures in locating substandard 
areas, and in determining relative economic status as measured by 
rental. 

The First Series Housing Bulletins for states and the block supple- 
ments for cities are scheduled for release starting in the fall of 1941. 
The entire series will be completed during the winter of 1941-42. 

Second Series Housing Bulletins. This series of bulletins will also be 
issued on a state-by-state basis. The general plan is to present the 
maximum amount of detailed statistics for the following areas: each 
state by urban, rural-nonfarm, and rural-farm areas; each city of 50,000 
inhabitants or more; and each metropolitan district. A reduced amount 
of information will be included for cities of 10,000 to 50,000, and a still 
further reduction will be made for urban places of 2,500 to 10,000, and 
the rural-nonfarm and rural-farm areas of counties. 

Included in the program for those areas for which statistics will be 
presented in greatest detail are the items in the first five of the six 
principal groups of items covered by the Housing Census as explained 
above. For some items, statistics will be presented for the total number 
of dwelling units in each of these areas. For many of the items, how- 
ever, data will be shown separately for the units that are owner- 
occupied, tenant-occupied, vacant for sale or rent, and vacant not for 
sale or rent. Included in this category are such items as type of struc- 
ture, toilet facilities, water supply, and year built. 

For a limited group of items the data will be presented for the owner- 
occupied and for the tenant-occupied units by color or race of the head 
of the household. Some of the more important items included in this 
category are state of repair and plumbing equipment, number of rooms, 
persons per room, contract or estimated monthly rent, value of home 
and mortgage status (for nonfarm owner-occupied units only), and 
gross rent (for nonfarm tenant-occupied units only). 

It is impracticable to publish the full line of information indicated 
above for each of the smaller areas since the number of areas involved 
is too large. As a result, for cities of 10,000 to 50,000 the subclassifica- 
tions by occupancy, tenure, and color or race, except for color or race 
in the Southern states, have had to be reduced drastically. Still further 
reductions of this nature had to be made for urban places of 2,500 to 
10,000, and the rural-nonfarm and rural-farm areas of counties. A few 
items such as structures by type of structure, and structures by exterior 
material were deleted entirely from the latter group, but in general it is 
planned to publish at least a minimum amount of information for 
practically all items for each of these small areas. 

The results presented in the Second Series Housing Bulletins will be 
among the most significant data obtained in the Housing Census. 
Public housing officials will be able to determine the number of dwelling 
units lacking various types of equipment which they consider essential 
for adequate housing. They also will be able to ascertain the number of 
occupied dwelling units in which excessive crowding exists. Real 
estate and construction groups will be able to obtain information 
regarding vacancy by rental groups, type of structure, number of 
rooms, and other essential controls, as well as data regarding the 
facilities and equipment of homes in various communities throughout 
the country. Marketing and sales organizations will be able to deter- 
mine the number and geographic distribution of dwelling units with 
various types of equipment for plumbing, heating, lighting, and re- 
frigeration, as well as radio, cooking fuel, and heating fuel. The figures 
on value and rental, in the form of averages and also by value and rent 
groups, will be useful in measuring general purchasing power. 

The first of the Second Series Housing State Bulletins is tentatively 
scheduled for release during the fall of 1941, while the last of the series 
and a summary for the United States is expected to be ready sometime 
during the spring of 1942. 

Third Series Housing Bulletins. This series of bulletins will present 
data showing the various cross-relationships existing among housing 
items, such as the relationship between monthly rental and number 
of rooms and type of structure. Users of the housing statistics will find 
these statistics useful in answering detailed analytical questions relating 
to housing, such as: Which of the various equipment items show varia- 
tions that are most closely related to the variations in rental? What is 
the size of families now living in substandard units? How much rental 
is paid for such units? An analysis of these more detailed statistics is 
essential in any comprehensive study of the problems of real estate 
and housing. Because of the extensiveness of the tabulations for each 
area, it is obvious that these more analytical tabulations must be 
limited to a relatively small group of areas. According to present 
tentative plans, data will be published for the larger cities and metro- 
politan districts, and for the urban, rural-nonfarm, and rural-farm areas 
of states. 

The release dates for the various reports in this series, insofar as can 
be determined at present, extend from the winter of 1941-42 to the 
summer of 1942. 

Fourth Series Housing Bulletins. This series of reports will contain 
information on the characteristics of mortgage indebtedness on 1-4 
family owner-occupied nonfarm properties without business. In addi- 
tion to the various items relating to the mortgage, data will also be 
presented for several general housing characteristics, such as type of 
structure, year built, estimated monthly rental, number of rooms, and 
state of repair. All tabulations relating to the mortgage data and the 
relationship of mortgage data to general housing and population char- 
acteristics will be shown in this series of bulletins. These data will 
include such items as characteristics of the mortgages held by various 
types of mortgage holders, the relationship of mortgage indebtedness to 
value of property, and similar items. These tabulations will be used by 
home financing institutions and agencies in shaping their lending 
policies. Such data serve as a base in determining the volume and 
characteristics of the outstanding home mortgage indebtedness. 

The approximate release dates for the various reports in this series 
extend from the spring of 1942 to the fall of 1942. 

Fifth Series Housing Bulletins. The bulletins of this series present 
the interrelationships between certain of the housing characteristics 
and the characteristics of the families or households occupying the 
dwelling units. Dwelling unit characteristics include tenure, rent or 
value, type of structure, number of rooms, and persons per room. 
Family or household characteristics include household composition, 
number of related workers, total wage income of related members of 
the household, as well as occupation, work status, and place of residence 
in 1935 of the head of the household. Tabulations will include, among 
others, the relationships between wage income and value or rent, and 
the characteristics of the homes of workers in various occupation 
groups. 

These data will be obtained from a sample of the households enumer- 
ated in the census. The areas for which these data will be published 
probably will be limited to a small number of the largest cities and 
metropolitan districts; the urban, rural-nonfarm, and rural-farm areas 
of practically all states; and each state as a whole. The necessity for 
using these rather large areas arises from the size of the sample which 
must be used and the nature of the data which requires, for best results, 
considerable cross classification. Release dates for this series of reports 
are tentatively set for the summer of 1942 to the fall of that year. 

These five series of housing bulletins represent a vast store of badly 
needed information on housing conditions. In collecting, tabulating, 
and publishing this information, the Census Bureau will continue to 
consider carefully, within the practical limits imposed by time and 
available funds, the suggestions, criticisms, and interests of the users 
of these statistics. A substantial part of the tabulation and publication 
program is now, of necessity, in final form, but many parts, particularly 
in the third, fourth, and fifth series of bulletins, are in less nearly com- 
plete form and, within certain limits, are subject to revision. 











A SIGNIFICANCE TEST FOR TIME SERIES ANALYSIS* 


By W. ALLEN WALLIS AND GEOFFREY H. MOORE 
Stanford University and Rutgers University 
National Bureau of Economic Research 


* Presented (in slightly different wording) before the Nineteenth Annual Conference of the Pacific 
Coast Economic Association, Stanford University, December 28, 1940, and based on research carried 
out at the National Bureau of Economic Research, 1939-40, under Research Associateships provided 
by the Carnegie Corporation of New York. A fuller account of the method and its uses will be published 
soon by the National Bureau of Economic Research as the first of its new series of Technical Papers. 

NO KNOWN SIGNIFICANCE TEST is entirely appropriate to economic 
time series. One shortcoming of tests in common use is that they 
ignore sequential or temporal characteristics; that is, they take no ac- 
count of order. The standard error of estimate, for example, implicitly 
throws all residuals into a single frequency distribution from which to 
estimate a variance. Furthermore, the usual tests cannot be applied 
when series are analyzed by moving averages, free-hand curves, or 
similar devices frequently resorted to in economics for want of more 
adequate tools. This paper presents a test of an opposite kind, one de- 
pending solely on order. Its principal advantages are speed and 
simplicity, absence of assumptions about the form of population, and 
freedom from dependence upon “mathematically efficient” methods, 
such as least squares. This test is based on sequences in direction of 
movement, that is, upon sequences of like sign in the differences be- 
tween successive observations (or some derived quantities, e.g., resid- 
uals from a fitted curve). In essence, it tests the randomness of the 
distribution of these sequences by length. 

Each point at which the series under analysis (either the original 
or a derived series) ceases to decline and starts to rise, or ceases to 
rise and starts to decline, is called a turning point. A turning point is a 
“peak” if it is a (relative) maximum or a “trough” if a (relative) mini- 
mum. The interval between consecutive turning points is called a 
“phase.” A phase is an “expansion” or a “contraction” according to 
whether it starts from a trough and ends at a peak, or starts from a peak 
and ends at a trough. For the purposes of the present test, the in- 
complete phase preceding the first turning point and that following the 
last turning point are ignored. The length or duration of a phase is the 
number of intervals (hereafter referred to as “years,” though they may 
represent any system of denoting sequence) between its initial and 
terminal turning points. Thus, a series of N observations may contain 
as few as zero or as many as N − 2 turning points; and a phase may be as 
short as one year (when two consecutive observations are turning 
points) or as long as N − 3 years (when only the second and penultimate 
observations are turning points). 
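
The definitions of turning point, peak, trough, and phase can be illustrated with a short Python sketch. It is not part of the paper: the function names are hypothetical, positions are counted from zero, and tied consecutive observations are simply rejected rather than treated. The ten-item sequence 0, 2, 1, 5, 7, 9, 8, 7, 9, 8 serves as the example.

```python
def difference_signs(series):
    """Sign (+1 or -1) of each difference between successive observations."""
    signs = []
    for earlier, later in zip(series, series[1:]):
        if later == earlier:
            # Ties are outside the scope of this sketch.
            raise ValueError("consecutive observations are tied")
        signs.append(1 if later > earlier else -1)
    return signs


def turning_points(series):
    """Return (position, kind) for each turning point; positions are 0-based."""
    signs = difference_signs(series)
    points = []
    for i in range(1, len(signs)):
        if signs[i - 1] != signs[i]:
            # A rise followed by a fall is a peak; a fall followed by a rise is a trough.
            points.append((i, "peak" if signs[i - 1] > 0 else "trough"))
    return points


def completed_phases(series):
    """Return (kind, duration) for each phase lying between two turning points."""
    points = turning_points(series)
    phases = []
    for (start, kind), (end, _) in zip(points, points[1:]):
        name = "expansion" if kind == "trough" else "contraction"
        phases.append((name, end - start))
    return phases


example = [0, 2, 1, 5, 7, 9, 8, 7, 9, 8]
print(turning_points(example))
print(completed_phases(example))
```

Because only the intervals between consecutive turning points are returned, the incomplete phases at the two ends of the series drop out automatically, as the definition requires.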

The greater the number of consecutive rises in a series drawn at 
random from a stable population, the less is the probability of an ad- 
ditional rise; for the higher any observation may be the smaller is the 
chance of drawing one which exceeds it. To calculate the expected fre- 
quency distribution of phase durations, only one weak assumption need 
be made about the population from which the observations come, 
namely that the probability of two consecutive observations being 
identical is infinitesimal—a condition met by all continuous popula- 
tions, hence by virtually all metric data. 
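
The first statement of this paragraph can be made concrete under one added assumption, namely that the observations are independent draws from a single continuous population (our reading of "drawn at random from a stable population"). Any k + 1 such observations are equally likely to occur in each of their (k + 1)! possible orders, and only one order is steadily rising, so that:

```latex
\[
P(k \text{ consecutive rises}) = \frac{1}{(k+1)!},
\qquad
P(\text{a further rise} \mid k \text{ consecutive rises})
  = \frac{1/(k+2)!}{1/(k+1)!} = \frac{1}{k+2}.
\]
```

On this assumption the chance of a further rise is 1/2 at the outset, 1/3 after one rise, 1/4 after two, and so on.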

Without further postulates about the form of the population, it is 
possible to conceive a mathematical transformation of it leading to a 
known population, but leaving unaltered the pattern of rises and falls 
of the original observations. For example, if each observation is re- 
placed by its rank according to magnitude within the entire series, the 
ranks have exactly the same pattern of expansions and contractions as 
the original observations; and their distribution is simple and definite, 
each integer from 1 to N having a relative frequency of 1/N. The dis- 
tribution of phase durations expected among random arrangements of 
the digits 1 to N is, therefore, comparable with the distribution 
observed in any set of data. A little mathematical manipulation reveals 
that in random arrangements of N different items the expected number 
of completed phases of duration d is 

$$\frac{2(d^2+3d+1)(N-d-2)}{(d+3)!}.$$

The expected mean duration of a phase is 

$$\frac{3N-11.6194}{2N-7},$$

essentially 1½. 

To test the randomness of a series with respect to phase durations, 
the first step is to list in order the signs of the differences between 
successive items. Thus the sequence 0, 2, 1, 5, 7, 9, 8, 7, 9, 8 becomes 
+, −, +, +, +, −, −, +, −. The signs are, of course, one fewer than 
the observations. The second step is to make a frequency distribution 
of the lengths of runs in the signs. There are four completed runs in the 
example just given (the first and last being ignored as incomplete), of 
lengths 1, 3, 2, and 1. The frequency distribution thus shows two 
phases of one year, one of two years, and one of three years. In case 
consecutive items are equal but it can be assumed that sufficiently 
refined measurement would reveal at least a slight difference (an as- 
sumption valid whenever the test is applicable), the phase lengths are 
tabulated separately for each possible sequence of signs of differences 
between tied items; and the resultant distributions are averaged, each 
being weighted by the probability of securing that distribution if each 
difference observed as zero is equally likely to be positive or negative. 
Third, the expected frequency for each length of phase is calculated 
from the formula above, taking as N the number of items in the se- 
quence being tested—in this case, 10. Next, the observed and expected 
frequency distributions are compared by computing chi-square in the 
usual way for testing goodness-of-fit: that is, by squaring the differ- 
ences between actual frequencies and corresponding theoretical fre- 
quencies, dividing these squares by the respective theoretical fre- 
quencies, and summing the resultant ratios. In nearly all applications 
of the present test, avoidance of expected frequencies that are too small 
necessitates restricting the distribution of phase durations to three 
frequency classes, namely one year’s duration, two years’ duration, and 
over two years’ duration, the theoretical frequencies for these classes 
being 5(N − 3)/12, 11(N − 4)/60, and (4N − 21)/60, respectively. 
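
The steps just described can be sketched in Python as follows. This is an illustration only, not the authors' computing procedure: the function names are hypothetical, the averaging step for tied items is not implemented (ties raise an error), and series shorter than six items are rejected on the assumption that the three-class comparison is not meant for them, Table I beginning at N = 6.

```python
from math import factorial


def completed_phase_durations(series):
    """Lengths of the completed runs of like sign among successive differences."""
    signs = []
    for earlier, later in zip(series, series[1:]):
        if later == earlier:
            raise ValueError("tied items need the averaging step, not handled here")
        signs.append(later > earlier)
    runs = []
    for sign in signs:
        if runs and runs[-1][0] == sign:
            runs[-1][1] += 1
        else:
            runs.append([sign, 1])
    # The first and last runs are incomplete phases and are ignored.
    return [length for _, length in runs[1:-1]]


def expected_phases(d, n):
    """Expected number of completed phases of duration d among n random items."""
    return 2 * (d * d + 3 * d + 1) * (n - d - 2) / factorial(d + 3)


def chi_square_p(series):
    """Return (statistic, observed, expected) for the three duration classes."""
    n = len(series)
    if n < 6:
        raise ValueError("this sketch assumes at least six observations")
    durations = completed_phase_durations(series)
    observed = [
        sum(1 for d in durations if d == 1),
        sum(1 for d in durations if d == 2),
        sum(1 for d in durations if d > 2),
    ]
    # Theoretical frequencies for the three classes, as stated in the text.
    expected = [5 * (n - 3) / 12, 11 * (n - 4) / 60, (4 * n - 21) / 60]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, observed, expected


statistic, observed, expected = chi_square_p([0, 2, 1, 5, 7, 9, 8, 7, 9, 8])
print(observed, [round(e, 4) for e in expected], round(statistic, 3))
```

For the ten-item example the observed frequencies are two, one, and one. As a check against Table I, a six-item series whose only completed phase lasts three years, such as 5, 1, 2, 3, 4, 0, gives 19.667, the largest value listed there for N = 6.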

The sum of the three ratios of squared deviations to expectations is, 
then, similar to chi-square for two degrees of freedom, one degree of 
freedom being lost because a single linear constraint is imposed on the 
theoretical frequencies by taking the value of N from the observations. 
It is advisable, however, to distinguish chi-square for phase durations 
by a subscript p (denoting phase), because it does not quite conform to 
the Pearsonian distribution function ordinarily associated with the 
symbol $\chi^2$. The phase lengths in a single series are not entirely inde- 
pendent of one another; as a result, very large and very small values of 
$\chi_p^2$ are a little more likely than is shown by the $\chi^2$ distribution, and the 
mean and variance of $\chi_p^2$ generally exceed those of $\chi^2$. We have not 
determined the sampling distribution of $\chi_p^2$ mathematically, but have 
secured empirically a substitute that appears satisfactory. In the first 
place, a recursion formula enabled us to calculate the exact distribution 
of $\chi_p^2$ for small values of N. Table I gives the exact probability of ob- 
taining a value as large as or larger than each possible value of $\chi_p^2$ for 
values of N from 6 to 12, inclusive. As a second step toward determining 
the sampling distribution, an empirical distribution of $\chi_p^2$ was secured 
from 700 random series, 200 for N = 25, 300 for N = 50, and 200 for 
N = 75. The three distributions for separate values of N were not homo- 
geneous with one another nor with the exact distribution for N = 12; 
but the differences among them were unimportant for the present 
purposes, occurring chiefly at the higher probabilities rather than at the 
tail (the important region for a test of significance), and representing 
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TABLE I 


DISTRIBUTIONS OF xy?: EXACT FOR N=6 TO 12, AND APPROXIMATE 
FOR LARGER VALUES OF N 


(P represents the probability that an observed xy? will equal or exceed the specified value) 
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N=6 | N=9 N=11 N =12 | N>12 
xs? P xp? | P xp? P xp P xp? P 
.467 1.000 .358 1.000 .479 1.000 0.615 1.000 5.448 -10 
. 867 869 1.158 | -798 .579 .980 0.661 - 984 5.50 .098 
1.194 .675 1.267 631 .817 934 0.748 .896 5.674 09 
1.667 .453 1.630 | .605 917 844 0.794 -891 5.75 . 087 
2.394 .367 2.067 489 | .979 .730 0.837 .850 5.927 -08 
2.867 222 2.430 .452 | 1.088 -723 0.971 7 6.00 .077 
19.667 | 053 2.758 .381 1.279 -655 1.015 -720 6.163 .07 
oo , 3.158 | 374 1.317 57 1.061 -685 6.25 .069 
N=7 3.267 | .321 1.588 .537 1.415 .585 6.50 .061 
3.667 215 1.700 -473 1.461 583 6.541 .06 
Xp? | 4.030 .164 1.800 472 1.637 569 6.75 .054 
552 1.000 4.067 .144 2.079 .468 1.683 533 6.898 06 
.733 -789 4.758 110 2.200 .467 1.933 . 487 7.00 .048 
.752 -703 5.667 .078 2.309 466 1.948 .486 7.25 .043 
-933 .536 6.067 -064 2.409 .440 2.067 .428 7.401 -O4 
1.733 .493 7.485 020 2.417 -403 2.156 -427 7.50 .038 
2.152 .370 15.666 -005 2.500 -392 2.203 407 7.75 .034 
2.333 3202 | ———————_|_ 2.579 .384 2.289 344 8.60 .030 
3.933 .277 N=10 2.688 304 2.333 333 8.009 03 
5.606 . 169 2.809 274 2.556 .331 8.25 .027 
7.504 -117 Xp? P 3.026 -261 2.615 .303 8.50 .024 
8.904 -055 .328 1.000 3.109 -230 2.661 .303 8.75 -021 
a -614 -941 3.213 .201 2.733 .300 8.836 02 
N=8 -728 -917 3.300 147 2.837 .300 9.00 .019 
1.055 .813 3.779 | 147 2.870 287 9.25 .017 
Xp? a 1.341 -693 3.800 147 2.883 246 9.50 -015 
.284 1.000 1.419 -606 3.909 133 2.956 216 9.75 .013 
.684 .843 1.585 -601 4.117 128 3.267 211 10.00 012 
.844 -665 1.705 .594 4.313 .126 3.415 207 10.25 010 
.920 .590 1.772 .592 4.388 .099 3.489 149 10.312 01 
1.320 .560 1.814 -526 4.726 091 3.933 127 10.50 009 
1.480 -506 1.819 .419 5.000 077 4.070 127 10.75 008 
2.364 .495 2.313 -407 5.609. 077 4.156 114 11.00 007 
2.680 .471 2.577 374 5.700 .076 4.348 113 11.25 006 
2.935 -392 2.676 327 6.013 -055 4.394 113 11.50 006 
3.000 .299 2.743 327 8.200 -050 4.571 112 11.755 0065 
4.375 .293 2.863 274 8.635 .032 4.616 109 12.00 004 
4.455 - 235 2.905 242 9.468 -022 4.733 101 13.00 003 
4.935 .194 2.977 -220 9.735 .018 5.667 .092 14.00 002 
5.000 .133 3.242 -181 10.214 .009 5.803 .092 15.0865 -001 
5.819 -064 3.834 .179 11.435 .004 5.889 090 
6.455 .033 3.970 165 6.025 .090 
4.333 158 6.733 -085 
4.400 - 158 6.842 072 
4.676 -139 6.956 -060 
4.858 -107 7.504 -050 
5.128 .072 7.622 -041 
5.491 .059 8.576 -029 
6.515 -054 8.822 -026 
7.133 -042 9.237 .019 
11.308 -014 9.267 -014 
12.965 -006 10.556 -003 
19.667 -000 
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local irregularities rather than basic differences in form. It appears, 
therefore, that when N is as large as 12 a single sampling distribution 
of x,’ is sufficient. 
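
An empirical distribution of this kind can be approximated by simulation. The following is a rough sketch, not the authors' procedure: it assumes that a "completed phase" is a run of like signs among the first differences with the incomplete first and last runs excluded, and it draws each random series from Python's uniform generator. All function names are hypothetical.

```python
import random


def phase_durations(series):
    """Lengths of completed phases: runs of like sign among the first
    differences, leaving out the incomplete first and last runs.
    (Equal neighbours are treated as declines; continuous random data
    make such ties practically impossible.)"""
    signs = [1 if later > earlier else -1
             for earlier, later in zip(series, series[1:])]
    runs, length = [], 1
    for previous, current in zip(signs, signs[1:]):
        if current == previous:
            length += 1
        else:
            runs.append(length)
            length = 1
    runs.append(length)
    return runs[1:-1]


def chi_square_p(series):
    """Phase-duration chi-square computed from three frequency classes."""
    n = len(series)
    expected = [5 * (n - 3) / 12, 11 * (n - 4) / 60, (4 * n - 21) / 60]
    observed = [0, 0, 0]
    for duration in phase_durations(series):
        observed[min(duration, 3) - 1] += 1
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def simulate(n, trials, seed=0):
    """Values of the statistic for `trials` random series of length n."""
    rng = random.Random(seed)
    return [chi_square_p([rng.random() for _ in range(n)])
            for _ in range(trials)]


values = simulate(n=50, trials=300)
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
print(round(mean, 2), round(variance, 2))
```

The printed mean and variance depend on the seed and will differ somewhat from any particular published sample.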

The mean of the 700 values of $\chi_p^2$ is 2.3049, and the variance
5.0458. As an approach to the distribution of $\chi_p^2$, it seems
reasonable simply to reduce $\chi_p^2$ by approximately one-seventh and
refer it to the $\chi^2$ distribution for two degrees of freedom, which
has a mean of two and tables for which are readily available; and such a
comparison does indicate good conformity. That the variance of the
observed values is less than that of (7/6)$\chi^2$ for two degrees of
freedom suggests, however, that a more satisfactory fit at the tails can
be secured by using a distribution having a variance of 5, e.g.,
$\chi^2$ for “two and one-half degrees of freedom.” For $\chi_p^2$ above
about 5.5 and P below about .10, the agreement of this distribution with
the observations is very satisfactory. In the main body of the
distribution the function whose mean value is equated to the sample mean
gives a somewhat better fit.

In practice, therefore, the procedure for interpreting $\chi_p^2$,
assumed always to be calculated from three frequency classes, is as
follows: If $\chi_p^2$ is less than 6.3 (the point of intersection
between the ogives of (7/6)$\chi^2$ for two degrees of freedom and
$\chi^2$ for two and one-half degrees of freedom), reduce it by
one-seventh and refer to the usual $\chi^2$ tables for two degrees of
freedom. This procedure is satisfactory for all values of $\chi_p^2$,
but for values above 6.3 somewhat more accurate results are apparently
secured by referring $\chi_p^2$ to the last column of Table I, which
gives the distribution of $\chi^2$ for two and one-half degrees of
freedom. When N < 13 the exact distributions should, of course, be used.
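
For readers without the tables at hand, the rule just stated might be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the chi-square tail is evaluated from the standard power series for the incomplete gamma function rather than looked up in Table I, and the exact distributions for N < 13 are not covered.

```python
import math


def chi_square_upper_tail(x, degrees_of_freedom):
    """P(chi-square >= x), from the series for the regularized lower
    incomplete gamma function."""
    a, half = degrees_of_freedom / 2.0, x / 2.0
    if half <= 0:
        return 1.0
    term, total, n = 1.0, 1.0, 0
    while term > 1e-15 * total:
        n += 1
        term *= half / (a + n)
        total += term
    lower = total * math.exp(-half + a * math.log(half)) / math.gamma(a + 1.0)
    return max(0.0, 1.0 - lower)


def phase_test_probability(chi_p_squared):
    """Approximate P for N of 13 or more, following the rule in the text."""
    if chi_p_squared < 6.3:
        # Reduce by one-seventh and use two degrees of freedom.
        return chi_square_upper_tail(chi_p_squared * 6.0 / 7.0, 2.0)
    # Otherwise use chi-square with two and one-half degrees of freedom.
    return chi_square_upper_tail(chi_p_squared, 2.5)


print(round(phase_test_probability(1.857), 2))
print(round(phase_test_probability(9.861), 3))
```

For two degrees of freedom the tail probability is simply exp(-x/2), so the first branch can also be worked by hand.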

A simpler test of the same nature may be based on the fact that in a
random sequence of N observations (where N is not too small, say not
less than 10) the total number of completed phases is normally
distributed about a mean of (2N - 7)/3 with variance of (16N - 29)/90.
(In using this test, the difference between the observed and expected
numbers of phases should be reduced in absolute value by one-half unit,
in order to allow for discontinuity.) This test of the total number of
phases, which is essentially equivalent to a test of the mean phase
duration, is normally less sensitive than the $\chi_p^2$ test, which
takes account of the lengths of the phases, though the superiority of
the $\chi_p^2$ test in this respect is limited by the necessity of
confining the frequency distribution to three classes. Advantages of the
test of the total number of phases are that it is even simpler to apply
than the $\chi_p^2$ test, that its sampling distribution is known
exactly and is readily available, and that it is adaptable to cases
where the hypothesis alternative to the null hypothesis is either that
the phases are abnormally long or that they are abnormally short.
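
A minimal sketch of this simpler test, with a hypothetical function name and an illustrative input (55 completed phases in a sequence assumed to contain 68 items):

```python
import math


def total_phases_test(observed_phases, n):
    """Normal deviate and two-sided probability for the total number of
    completed phases, with the half-unit correction for discontinuity."""
    mean = (2 * n - 7) / 3
    variance = (16 * n - 29) / 90
    deviation = max(abs(observed_phases - mean) - 0.5, 0.0)
    z = deviation / math.sqrt(variance)
    two_sided_p = math.erfc(z / math.sqrt(2))
    return z, two_sided_p


z, p = total_phases_test(55, 68)
print(round(z, 2), round(p, 4))
```

When the alternative hypothesis specifies the direction (phases too long, or too short), half of the two-sided probability would be the relevant figure.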

Application of the $\chi_p^2$ test to an economic problem may be
illustrated by an analysis of sweet potato production, yield per acre,
and acreage harvested in the United States, 1868-1937, as recorded on
page 243 of Agricultural Statistics, 1939. The frequency distributions
of phase durations in these series have been compared with those to be
expected in a random sequence. From the values of $\chi_p^2$ and their
corresponding probabilities, it appears that the fluctuations in
production conform with what would be expected in a random series; while
of the two components of total production, yield per acre conforms well
and acreage harvested does not conform at all.

The figures on total production do not, of course, constitute a ran- 
dom series, for there is a marked upward trend in the data. In general, 
the method here presented is not very sensitive to primary trend. The 
removal of trend from a series, or its introduction into a trendless 
series, can change the sign of the difference between consecutive items 
only if the trend factor for a single year is greater than the difference 
between the items in trend-adjusted form. If, therefore, the residuals 
from trend are such that their first differences are rarely as small as the 
trend factor for a single year, as is frequently the case in economic time 
series, the distribution of phase lengths will not be much affected by the 
presence or absence of trend. Another factor tending to minimize the 
effect of trend on the test is that expansions are lengthened and con- 
tractions shortened if the trend is upward, and vice versa if it is down- 
ward, leaving the total number of phases of a given duration relatively 
unaffected. In such cases the existence of trend may be revealed by 
separate distributions for expansions and contractions. In the case of 
sweet potato production, both distributions conform well to the chance 
distribution and to one another; but for acreage the two distributions 
differ markedly, suggesting that the non-randomness evidenced in the 
acreage series may be at least partly attributable to trend. 
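
The first argument above, that trend alters the sign of a first difference only where the trend step exceeds the difference between the trend-adjusted items, can be illustrated with a short sketch. The function names and the sample residuals are hypothetical.

```python
def sign(value):
    """Return 1, 0 or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


def signs_changed_by_trend(residuals, slope):
    """Count the first differences whose sign changes when a linear trend
    with the given per-period slope is added to the residuals."""
    changed = 0
    for earlier, later in zip(residuals, residuals[1:]):
        if sign(later - earlier) != sign(later - earlier + slope):
            changed += 1
    return changed


residuals = [0, 5, -3, 4, -6]   # first differences: 5, -8, 7, -10
for slope in (1, 6, 9):
    print(slope, signs_changed_by_trend(residuals, slope))
```

In this made-up series a sign changes only once the slope exceeds the smallest decline (8 units), so the phase pattern is untouched by the smaller trends.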

Lack of sensitivity to primary trend is a limitation of the technique 
from the point of view of detecting the existence of such a trend. On the 
other hand, it is not difficult to determine by other methods whether a 
primary trend exists—the rank correlation between the variate and the 
date often affords a convenient test. And for determining whether the 
systematic variation contains secondary components, e.g., cyclical or 
seasonal variations, it is a decided advantage of the present method 
that it frequently gives satisfactory results regardless of the presence of 
trend, thus avoiding the complexities of trend elimination. It is possible,
of course, for secondary fluctuations also to be concealed if their year
to year magnitude is definitely less than year to year random
movements; this is not so likely as in the case of a primary trend, but is a
real possibility in the case of gradual movements, e.g., long waves.
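
The rank correlation between the variate and the date, mentioned above as a convenient test for primary trend, might be computed as in the following sketch. It uses the usual sum-of-squared-rank-differences formula, which is exact only when there are no tied values; the function names are hypothetical.

```python
def ranks(values):
    """Ranks 1..n, tied values receiving the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def rank_correlation_with_date(values):
    """Rank correlation between a series and its time order 1, 2, ..., n."""
    n = len(values)
    value_ranks = ranks(values)
    d_squared = sum((value_ranks[t] - (t + 1)) ** 2 for t in range(n))
    return 1 - 6 * d_squared / (n * (n * n - 1))


print(rank_correlation_with_date([2, 1, 4, 3, 5]))
```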

A second example illustrates the use of $\chi_p^2$ as a criterion of the
fit of moving averages, and for selecting the proper period for a moving
average. If a moving average, or any other curve, describes adequately
the systematic variation in a series, the residuals should constitute a
random sequence. If the period is too long, waves or cycles may appear
in the residuals, and if it is too short the residuals will cluster too
closely about the line. To illustrate this application, ten moving
averages having spans from 2 to 11 years have been fitted to the data on
sweet potato acreage, and the residuals tested for randomness. Each
moving average uses equal weights; the necessity of centering each
average at an observation, however, means an implicit increase of one
year in the span of averages based on an even number of points, with the
first and last observations receiving half weight. The last two columns
of Table II show the values of $\chi_p^2$, and the corresponding
probabilities, obtained by testing the residuals for randomness. The
moving averages based on even numbers of years give notably better
results than those based on the corresponding odd numbers of items; the
tapering of the weight diagram implicit in the even averages evidently
improves the fit. Another striking feature is that the probabilities
first rise and then decline. Thus, the odd averages attain a maximum
probability of .24 at seven years while the even averages give the best
result at six years, when the probability is .61. Had other weight
diagrams been tested still better results might have been obtained.
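
The centered moving averages described above can be sketched as follows. The function names are hypothetical, and the treatment of even spans (one extra point, half weight on the first and last) follows the description in the text.

```python
def centered_moving_average(values, span):
    """Equal-weight moving average centered on an observation.

    For an even span the window covers span + 1 points, the first and
    last of which receive half weight."""
    half = span // 2
    if span % 2 == 1:
        weights = [1.0 / span] * span
    else:
        weights = [0.5 / span] + [1.0 / span] * (span - 1) + [0.5 / span]
    smoothed = []
    for center in range(half, len(values) - half):
        window = values[center - half:center + half + 1]
        smoothed.append(sum(w * x for w, x in zip(weights, window)))
    return smoothed


def residuals_from_moving_average(values, span):
    """Observations minus the moving average centered on them."""
    half = span // 2
    smoothed = centered_moving_average(values, span)
    return [values[half + i] - s for i, s in enumerate(smoothed)]


print(centered_moving_average([1, 2, 3, 4], 2))
```

With 70 annual observations, spans of 2 and 3 leave 68 residuals, spans of 4 and 5 leave 66, and so on, which agrees with the values of N implied by the expected frequencies in Table II.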

It should be noted that a “better” result is not necessarily one in 
which the curve gives a closer fit, but one in which the residuals behave 
more like a series of independent, random observations, as judged by 
sequences in signs of first differences. The closest fits to the original 
observations are given by the shortest moving averages; but these 
describe not only the systematic variation but also a portion of the 
random fluctuations. If the moving average is either too short or too
long, $\chi_p^2$ will be significantly large; but in the former case its magnitude
results from an excess of short phases and a deficiency of long ones, 
while in the latter case the reverse is true. Table II includes the actual 
frequency distributions of residuals from the ten curves. 

TABLE II

FREQUENCY DISTRIBUTIONS OF PHASE DURATIONS IN RESIDUALS FROM MOVING
AVERAGES FITTED TO SWEET POTATO ACREAGE HARVESTED
UNITED STATES, 1868-1937

Span of                   Duration of phase (frequency)
moving                   One       Two     Over two     Total
average                  year     years      years   (frequency)  $\chi_p^2$      P
(years)

   2      Expected     27.083    11.733     4.183      43          16.823     .0004
          Observed     46         8         1          55

   3      Expected     27.083    11.733     4.183      43          16.823     .0004
          Observed     46         8         1          55

   4      Expected     26.25     11.367     4.05       41.667       1.857     .45
          Observed     31.25      8.25      4.5        44

   5      Expected     26.25     11.367     4.05       41.667       5.740     .09
          Observed     19.25      9         7.75       36

   6      Expected     25.417    11         3.917      40.333       1.141     .61
          Observed     25.5       8.25      5.25       39

   7      Expected     25.417    11         3.917      40.333       3.287     .24
          Observed     28.5       6         5.5        40

   8      Expected     24.583    10.633     3.783      39           2.547     .34
          Observed     25         7         6          38

   9      Expected     24.583    10.633     3.783      39           4.634     .14
          Observed     18.75      7.5       6.75       33

  10      Expected     23.75     10.267     3.65       37.667       3.175     .26
          Observed     23         5.5       5.5        34

  11      Expected     23.75     10.267     3.65       37.667       9.861     .01
          Observed     22         2         7          31

In order to compare this new test with a more elaborate procedure
frequently employed in time series analysis, a power series
$y = a + bx + cx^2 + dx^3 + \cdots$ was fitted by the method of least
squares to the series on sweet potato acreage harvested. The
calculations were carried as far as the ninth degree term, using the
technique of orthogonal polynomials, but none beyond the third effected
a significant reduction in the residual variance. According to the usual
criterion, therefore, a third degree curve would be regarded as fitting
adequately. The residuals from the third degree curve were then
submitted to the present test. There were 24 one-year phases, 3 two-year
phases, and 9 phases of more than two years, producing a $\chi_p^2$ of
12.47, from which it is clear that the fit of the third degree
polynomial is quite inadequate. The fault, of course, lies in inferring
that a third degree power series gives an adequate fit because no other
power series gives a significantly better fit as judged by the standard
error of estimate.
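
The quoted statistic can be checked by hand. The arithmetic below is a sketch that assumes the residuals cover all N = 70 annual observations for 1868-1937; the text does not state N for this fit.

```latex
\begin{align*}
E_1 &= \frac{5(70-3)}{12} \approx 27.917, \qquad
E_2 = \frac{11(70-4)}{60} = 12.100, \qquad
E_3 = \frac{4(70)-21}{60} \approx 4.317, \\[4pt]
\chi_p^2 &= \frac{(24-27.917)^2}{27.917}
          + \frac{(3-12.100)^2}{12.100}
          + \frac{(9-4.317)^2}{4.317}
        \approx 0.550 + 6.844 + 5.081 \approx 12.47 .
\end{align*}
```

Under that assumption the sum agrees with the 12.47 reported, and nearly all of it comes from the deficiency of two-year phases and the excess of long ones. Referred to the last column of Table I, the value lies between the entries for P = .004 and P = .003.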




$\chi_p^2$ can also be used to test the independence of two variates,
and in some circumstances is superior for this purpose to the rank
correlation coefficient. The procedure is to arrange the pairs according
to the order of magnitude of one variate and tabulate the distribution
of phase durations in the other variate. If the two series are
independent, the resulting value of $\chi_p^2$ will not be significant.
A difficulty, however, is that the conclusion occasionally depends upon
which variate is chosen for arranging in order and which for counting
the phase durations.
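
The ordering step might be sketched as below. This is illustrative only: the function names are hypothetical, a completed phase is taken to be a run of like signs among first differences with the incomplete first and last runs excluded, and ties in either variate are not given any special treatment.

```python
def phase_durations(series):
    """Lengths of completed phases: runs of like sign among the first
    differences, leaving out the incomplete first and last runs."""
    signs = [1 if later > earlier else -1
             for earlier, later in zip(series, series[1:])]
    runs, length = [], 1
    for previous, current in zip(signs, signs[1:]):
        if current == previous:
            length += 1
        else:
            runs.append(length)
            length = 1
    runs.append(length)
    return runs[1:-1]


def phase_counts_after_ordering(x_values, y_values):
    """Arrange the pairs by x, then count phases of one, two, and more
    than two steps in the corresponding y values."""
    pairs = sorted(zip(x_values, y_values), key=lambda pair: pair[0])
    ordered_y = [y for _, y in pairs]
    counts = [0, 0, 0]
    for duration in phase_durations(ordered_y):
        counts[min(duration, 3) - 1] += 1
    return counts


x = [2, 1, 4, 3, 6, 5, 8, 7]
y = [20, 10, 5, 30, 1, 40, 50, 60]
print(phase_counts_after_ordering(x, y))
```

The resulting counts would then be compared with the expected frequencies for N equal to the number of pairs; exchanging the roles of x and y gives the alternative arrangement mentioned in the text.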

It is perhaps advisable to emphasize explicitly that the present test
by no means utilizes all of the information in the data. In particular,
it ignores the magnitude of the year to year fluctuations, treating the
smallest as equivalent to the largest. A serial correlation coefficient
computed from ranks may retrieve some of this information on magnitude.
Another consideration in interpreting the test is that a set of phase
durations which appears random when viewed only as a frequency
distribution may not have been arranged at random in time. An additional
point, obvious but worthy of mention, is that the time unit used may
affect conclusions derived from the $\chi_p^2$ test; for example, year
to year movements may appear random and month to month movements
non-random, or vice versa.








A NOTE ON THE MEAN AS A POOR ESTIMATE 
OF CENTRAL TENDENCY 


By R. J. BROOKNER* 
Columbia University 


IT IS TOO OFTEN believed and taught that the “arithmetic mean is the
best estimate of central tendency.” The usual refutation of this
statement is the example which gives a variate $x$ which has the Cauchy
distribution

$$f(x) = \frac{1}{\pi}\,\frac{1}{1 + (x - \theta)^2}$$

in which the mean can easily be shown to be a very poor estimate of the
parameter $\theta$ which is the center point of the distribution. This
example is generally accepted as being a pathological one and the
argument given that the practical statistician never runs upon such a
distribution.

The following example, in which we will show that the mean is a poor
estimate of the center, however, is not open to the criticism that it is
not met in practice, nor can it in any sense be considered pathological.
We will show that there exists a statistic which is simpler to compute
than the arithmetic mean and which is so much better an estimate of the
mean value that the ratio of its variance to the variance of the
arithmetic mean approaches zero. Suppose a variate $x$ is known to be
distributed as a rectangular distribution of length 1 around an unknown
value $\theta$. (The restriction of unit length is not necessary as any
rectangular distribution of known length can be transformed to one of
unit length very simply.) That is, we consider the distribution

$$f(x) = \begin{cases} 1 & \text{if } \theta - \tfrac{1}{2} \le x \le \theta + \tfrac{1}{2}, \\ 0 & \text{otherwise.} \end{cases}$$


Suppose $N$ independent observations are made, say $x_1, x_2, \ldots,
x_N$, and we wish to estimate the unknown mean value $\theta$. We will
compare the estimate of the parameter by using the arithmetic mean

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$

with the estimate

$$t = \frac{u + v}{2}$$

where $u$ is the smallest measurement and $v$ the largest measurement of
$x_1, x_2, \ldots, x_N$.

* Working under a grant-in-aid of the Carnegie Corporation of New York.
The motivation for this simple calculation came from a discussion in a mathematical statistics
course given by Dr. A. Wald of Columbia University.

We will compare the two statistics by showing that each is an unbiased
estimate of $\theta$, i.e. that the expected value of each is $\theta$,
and by comparing the variances of the two statistics. If we consider the
change of variables $y = x - \theta + \tfrac{1}{2}$, then we have that
the distribution of $y$ is

$$f(y) = \begin{cases} 1 & \text{if } 0 \le y \le 1, \\ 0 & \text{otherwise} \end{cases}$$

and we simply see that the mean value of $x$ is $\theta - \tfrac{1}{2}$
plus the mean value of $y$ and that the variances of $x$ and $y$ are
equal. Now the expected value of $y$ is easily seen to be $\tfrac{1}{2}$
so the expected value of $\bar{y}$ is $\tfrac{1}{2}$ and the expected
value of $\bar{x}$ is $\theta$. We also know that the variance of the
rectangular distribution is $w^2/12$ where $w$ is in this case equal to
1, so the variance of $\bar{y}$ and also the variance of $\bar{x}$ is
equal to $1/(12N)$.

In order to compute the expected value and variance of the statistic
$t$, we need the joint distribution of the maximum and minimum of $N$
independent observations on $y$. To get this joint density function, we
consider the probability that the minimum will be greater than a certain
value $U$ and simultaneously the maximum will be less than a certain
value $V$. This is the probability that $N$ independent observations
will be between $U$ and $V$ and hence is $(V - U)^N$ if $V$ is greater
than $U$ and both $U$ and $V$ are between 0 and 1, otherwise the
probability is either 0 or 1. It can easily be verified that then the
density function is minus the cross partial of this probability (note
that this probability is not the usual cumulative distribution function
but is very similar to it), and so

$$g(u, v) = \begin{cases} N(N-1)(v-u)^{N-2} & \text{if } 0 \le u \le v \le 1, \\ 0 & \text{otherwise.} \end{cases}$$

Then we have

$$E\left(\frac{u+v}{2}\right) = \frac{1}{2}\int_0^1 du \int_u^1 (u+v)\,N(N-1)(v-u)^{N-2}\,dv = \frac{1}{2}$$

so the expected value of the statistic computed for the $x$'s is
$\theta$. Now the variance of $(u+v)/2$ is

$$\frac{E(u^2) + 2E(uv) + E(v^2)}{4} - \left(\frac{1}{2}\right)^2$$

so we compute

$$E(u^2) = \int_0^1\!\!\int_u^1 u^2\,N(N-1)(v-u)^{N-2}\,dv\,du = \frac{2}{(N+1)(N+2)},$$

$$E(uv) = \int_0^1\!\!\int_u^1 uv\,N(N-1)(v-u)^{N-2}\,dv\,du = \frac{1}{N+2},$$

$$E(v^2) = \int_0^1\!\!\int_u^1 v^2\,N(N-1)(v-u)^{N-2}\,dv\,du = \frac{N}{N+2}.$$

Hence the variance of $t$ is

$$\frac{1}{2(N+1)(N+2)}$$

and the ratio of the variance of $t$ to the variance of $\bar{x}$ is

$$\frac{6N}{(N+1)(N+2)}$$

which is always less than one for $N > 2$ and as a matter of fact
approaches zero as $N$ becomes large. That is, the larger the sample,
the larger is the waste of information in using $\bar{x}$ to estimate
$\theta$.
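
The variance ratio can be checked by simulation. The sketch below is illustrative only; the function names, the choice of N = 10, the number of trials and the seed are all arbitrary assumptions.

```python
import random


def exact_variance_ratio(n):
    """Variance of the midrange t divided by variance of the mean."""
    return 6 * n / ((n + 1) * (n + 2))


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


def simulated_variances(n, trials, theta=0.0, seed=0):
    """Sampling variances of the mean and of t = (u + v)/2 for samples of
    n from the rectangular distribution of unit length around theta."""
    rng = random.Random(seed)
    means, midranges = [], []
    for _ in range(trials):
        sample = [theta - 0.5 + rng.random() for _ in range(n)]
        means.append(sum(sample) / n)
        midranges.append((min(sample) + max(sample)) / 2)
    return sample_variance(means), sample_variance(midranges)


var_mean, var_t = simulated_variances(n=10, trials=20000)
print(round(var_t / var_mean, 3), round(exact_variance_ratio(10), 3))
```

For N = 10 the exact ratio is 60/132, so the midrange has less than half the variance of the mean; the simulated figure should fall close to this.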





























ON THE DISTRIBUTION OF THE PARTIAL 
ELASTICITY COEFFICIENT* 


By H. Gregg Lewis
The University of Chicago and Cowles Commission 


PARTIAL ELASTICITY is one of the most important and most frequently
used concepts in economic analysis. With the development of the
multiple regression technique in recent years, much statistical work
on demand and production has been done in an effort to measure some
of the most relevant elasticities. Estimates of these elasticities are
made by differentiating multiple regression functions fitted by least
squares to actual statistical data.

In such cases determination of the reliability or significance of the
estimates is desirable. Before this can be done it is necessary to
calculate the probability distribution of the coefficient estimates.
Where the classical least squares multiple regression specification is
appropriate (at least to a first approximation), it is possible to
obtain this probability distribution exactly.

Approximations to the exact test of significance (under the least
squares specification) were given in this JOURNAL by Professor Henry
Schultz¹ in 1933 and Mr. Jacob L. Mosak² in 1939. This note calls
attention to a paper by Mr. E. C. Fieller in Biometrika in 1932, in which
he derived the sampling distribution of a ratio whose numerator and
denominator are normally correlated.³ It can easily be demonstrated
that the partial elasticity coefficient is such a ratio, if the least squares
specification is appropriate.

* The author is indebted to Mr. Louis Court, Mr. John H. Smith, and Mr. Jacob L. Mosak for their
helpful suggestions.

1 “The Standard Error of the Coefficient of Elasticity of Demand,” March 1933, pp. 64-69.

2 “The Least-Squares Standard Error of the Coefficient of Elasticity of Demand,” June 1939, pp.
353-361.

3 “The Distribution of the Index in a Normal Bivariate Population,” Biometrika, XXIV, 1932, pp.
428-440. See also: R. C. Geary, “The Frequency Distribution of the Quotient of Two Normal Variates,”
Journal of the Royal Statistical Society, 1930, pp. 442-447; H. L. Rietz, “On the Frequency
Distribution of Certain Ratios,” Annals of Mathematical Statistics, September 1936, pp. 145-153; G. A.
Baker, “Distribution of the Means Divided by the Standard Deviations of Samples from Non-homogeneous
Populations,” Annals of Mathematical Statistics, February 1932, pp. 3-5.

Fieller showed that if $z_1$ and $z_2$ are any two normally correlated
variables with means $\bar{z}_1$ and $\bar{z}_2$, variances $\sigma_1^2$
and $\sigma_2^2$, and a correlation coefficient $r_{12}$, the
probability, $V_0$, of obtaining by random sampling a ratio

$$v = \frac{z_1}{z_2} \tag{1}$$

not less than a given ratio, $v_0$, is

$$V_0 = \int_{h}^{\infty}\!\!\int_{k}^{\infty} G(w_1, w_2)\,dw_2\,dw_1 + \int_{-h}^{\infty}\!\!\int_{-k}^{\infty} G(w_1, w_2)\,dw_2\,dw_1, \tag{2}$$

where

$$G(w_1, w_2) = \frac{1}{2\pi\sqrt{1-\rho^2}}\, \exp\!\left[-\frac{1}{2(1-\rho^2)}\left(w_1^2 - 2\rho w_1 w_2 + w_2^2\right)\right],$$

$$\begin{aligned}
h &= \frac{\bar{z}_2}{\sigma_2}, \\
k &= \frac{\bar{z}_1 - v_0 \bar{z}_2}{(\sigma_1^2 - 2 r_{12}\sigma_1\sigma_2 v_0 + v_0^2\sigma_2^2)^{1/2}}, \\
\rho &= \frac{r_{12}\sigma_1 - v_0\sigma_2}{(\sigma_1^2 - 2 r_{12}\sigma_1\sigma_2 v_0 + v_0^2\sigma_2^2)^{1/2}}, \\
w_1 &= \frac{z_2}{\sigma_2}, \\
w_2 &= \frac{z_1 - v_0 z_2}{(\sigma_1^2 - 2 r_{12}\sigma_1\sigma_2 v_0 + v_0^2\sigma_2^2)^{1/2}}.
\end{aligned} \tag{3}$$

Tables VIII and IX of Tables for Statisticians and Biometricians, Part
II, may be used to evaluate the integrals in equation (2).
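As Fieller has shown, equation (2) may be written in the form

$$V_0 = A + \frac{1}{\sqrt{2\pi}} \int_{-k}^{\infty} e^{-w_2^2/2}\,dw_2, \tag{4}$$

where

$$A = \int_{h}^{\infty}\!\!\int_{k}^{\infty} G(w_1, w_2)\,dw_2\,dw_1 - \int_{h}^{\infty}\!\!\int_{-\infty}^{k} G(w_1, w_2)\,dw_2\,dw_1. \tag{5}$$

The maximum value of $|A|$ is obviously
$\frac{1}{\sqrt{2\pi}}\int_{h}^{\infty} e^{-w_1^2/2}\,dw_1$. Hence
when $h$ is large, $|A|$ is small. For example, when $h = 3$,
$|A| < 0.003$. Thus, for large values of $h$, (4) reduces to the
approximation

$$V_0 \approx \frac{1}{\sqrt{2\pi}} \int_{-k}^{\infty} e^{-w_2^2/2}\,dw_2. \tag{6}$$

That is, for large values of $h$, the quantity

$$\frac{v - \dfrac{\bar{z}_1}{\bar{z}_2}}{\dfrac{1}{\bar{z}_2}\left(\sigma_1^2 - 2 r_{12}\sigma_1\sigma_2 v + v^2\sigma_2^2\right)^{1/2}} \tag{6a}$$

is distributed approximately normally with zero mean and unit
variance.⁴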
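
The large-h approximation might be applied numerically as in the sketch below. It is illustrative only: the function name and the input figures are hypothetical, and the normal integral is evaluated with the complementary error function.

```python
import math


def ratio_tail_probability(z1_bar, z2_bar, sigma1, sigma2, r12, v0):
    """Return h, k and the large-h approximation to the probability that
    the ratio z1/z2 is not less than v0 (the normal integral from -k to
    infinity)."""
    spread = math.sqrt(sigma1 ** 2
                       - 2 * r12 * sigma1 * sigma2 * v0
                       + v0 ** 2 * sigma2 ** 2)
    h = z2_bar / sigma2
    k = (z1_bar - v0 * z2_bar) / spread
    probability = 0.5 * math.erfc(-k / math.sqrt(2))
    return h, k, probability


# Hypothetical figures: numerator mean 2, denominator mean 10, unit
# standard deviations, no correlation; test the ratio against zero.
h, k, p = ratio_tail_probability(2.0, 10.0, 1.0, 1.0, 0.0, 0.0)
print(h, k, round(p, 4))
```

Here h = 10, so the neglected term A is far below the 0.003 bound quoted for h = 3, and the normal approximation should be adequate.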

It will now be shown that the partial elasticity coefficient under the 
least squares specification is a ratio whose numerator and denominator 
are normally correlated. 

Denote by $X$ the dependent variable, and by $Y_1, Y_2, \ldots, Y_k$ the
independent variables. Let Y be the true (population) value of the
regression function (population mean of an $X$ array), and

$$X' = \sum_{t=1}^{m} b_t f_t \tag{7}$$

the least squares estimate of Y. The $f_t$ are known functions of the
$k$ independent variables. Let $\sigma^2$ be the common population
variance of the $X$ arrays, i.e., the population (squared) standard
error of estimate.

The (sample) value of the partial elasticity of $X$ with respect to
$Y_i$ is then,

$$e_i = \frac{Y_i}{X'}\,\frac{\partial X'}{\partial Y_i} = \frac{Y_i \displaystyle\sum_{t=1}^{m} b_t \frac{\partial f_t}{\partial Y_i}}{\displaystyle\sum_{t=1}^{m} b_t f_t}. \tag{8}$$

Thus (for given values of the independent variables) both numerator
and denominator of $e_i$ are linear functions of the regression
coefficients $b_t$. But the $b_t$ are normally correlated.⁵ Hence the
numerator and denominator of $e_i$ are normally correlated.⁶ Therefore
the distribution of $e_i$ is the distribution of the index $v$.

Write $z_1$ for the numerator and $z_2$ for the denominator of $e_i$,
$\bar{z}_1$ and $\bar{z}_2$ their population means, $\sigma_1^2$ and
$\sigma_2^2$ their population variances, and $r_{12}$ their correlation.
The set of normal equations for the regression (7) is

$$\sum_{r=1}^{m} b_r M_{rs} - M_{Xs} = 0 \qquad (s = 1, 2, \ldots, m) \tag{9}$$

where $M_{rs} = \sum f_r f_s$ and $M_{Xs} = \sum X f_s$, the summations
extending over the $n$ observations. Write $D$ for the determinant of
the coefficients of the $b_r$ in equations (9), $D_{rs}$ for the
cofactor of $D$ corresponding to the element $M_{rs}$, and $C_{rs}$ for
$D_{rs}/D$.

4 This approximation is also given by Geary, op. cit., p. 442.

5 For a proof see: John H. Smith, Tests of Significance: What They Mean and How to Use Them,
Chicago, 1939, pp. 81-84.

6 This follows from the familiar proposition that two linear functions of the same set of normally
correlated variables are normally correlated.

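Then it can easily be verified that

$$\begin{aligned}
\sigma_1^2 &= E(z_1 - \bar{z}_1)^2 = Y_i^2 \sigma^2 \sum_{r=1}^{m}\sum_{s=1}^{m} C_{rs}\,\frac{\partial f_r}{\partial Y_i}\,\frac{\partial f_s}{\partial Y_i}, \\
\sigma_2^2 &= E(z_2 - \bar{z}_2)^2 = \sigma^2 \sum_{r=1}^{m}\sum_{s=1}^{m} C_{rs} f_r f_s, \\
r_{12}\sigma_1\sigma_2 &= E(z_1 - \bar{z}_1)(z_2 - \bar{z}_2) = Y_i \sigma^2 \sum_{r=1}^{m}\sum_{s=1}^{m} C_{rs} f_r\,\frac{\partial f_s}{\partial Y_i}.
\end{aligned} \tag{10}$$

The $C_{rs}$ may be computed by the familiar Doolittle method.⁷

It should be emphasized that tests of significance using equations (2)
or (6) are appropriate only when $n$ is large so that good estimates of
$\sigma^2$ can be obtained.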
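
Given the matrix of the $C_{rs}$, the double sums in equations (10) are straightforward to evaluate. The sketch below is illustrative: the function name and the numbers are hypothetical, and the matrix of the $C_{rs}$ is taken as given rather than computed from data.

```python
import math


def ratio_moments(c, f, df, y_i, sigma_sq):
    """Variances of the numerator z1 and denominator z2 of the elasticity,
    and their correlation, from the double sums in equations (10).

    c        -- matrix of the C_rs (list of lists)
    f        -- values of the functions f_t at the chosen point
    df       -- values of their partial derivatives with respect to Y_i
    y_i      -- value of Y_i at the chosen point
    sigma_sq -- (estimated) variance of the X arrays
    """
    m = len(f)
    pairs = [(r, s) for r in range(m) for s in range(m)]
    var_z1 = y_i ** 2 * sigma_sq * sum(c[r][s] * df[r] * df[s] for r, s in pairs)
    var_z2 = sigma_sq * sum(c[r][s] * f[r] * f[s] for r, s in pairs)
    covariance = y_i * sigma_sq * sum(c[r][s] * f[r] * df[s] for r, s in pairs)
    r12 = covariance / math.sqrt(var_z1 * var_z2)
    return var_z1, var_z2, r12


# Hypothetical straight-line regression X' = b1 + b2*Y evaluated at Y = 2:
# f = (1, Y), derivatives (0, 1), and an identity matrix standing in for C.
print(ratio_moments([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0], [0.0, 1.0], 2.0, 1.0))
```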

It may be useful to compare the above results with those of Schultz
and Mosak. Both Schultz and Mosak assumed implicitly that under
the least squares specification $e_i$ was (approximately) normally
distributed with mean $\bar{z}_1/\bar{z}_2$. Mosak’s approximation to
the square root of the sampling variance of $e_i$ was⁸

$$\sigma_{e_i} = \frac{1}{\bar{z}_2}\left[\sigma_1^2 - 2 r_{12}\sigma_1\sigma_2\,\frac{\bar{z}_1}{\bar{z}_2} + \frac{\bar{z}_1^2}{\bar{z}_2^2}\,\sigma_2^2\right]^{1/2}. \tag{11}$$

Notice that (11) is identical to the denominator of (6a), if in (11)
$\bar{z}_1/\bar{z}_2$ is replaced by its estimate $z_1/z_2$. In actual
practice this would generally be done, so that the test of significance
as given by Mosak is essentially identical to that using the
approximation (6) and valid only to that extent. In the illustrative
example used by Mosak,⁹ $h > 80$ (really an approximation of $h$ using
unbiased estimates of $\bar{z}_2$ and $\sigma_2$). Thus, in this
example, Mosak’s approximation to the exact test based on equation (2)
is excellent.¹⁰


7 For example, see the method given in Henry Schultz, The Theory and Measurement of Demand, 
Chicago, 1938, Appendix C. 

8 Op. cit., p. 355, equation (7). I have taken the liberty of translating Mosak’s notation into mine.
The same equation was also given by Schultz, who, however, could not evaluate $r_{12}$.

9 Op. cit., pp. 360-361.

10 Subject to the much more important qualification (of which both Mosak and Schultz were quite
aware) that the illustrative multiple regression was fitted to time series, and hence the appropriateness
of the least squares specification is in doubt. Moreover, the estimate of $\sigma^2$ was based on less than 20
degrees of freedom. The test based on equation (2) would, of course, similarly not apply.








THE USE OF PER CAPITA FIGURES FOR 
DEMAND CURVES* 


By Adolf Kozlik
Iowa State College 


FOR DETERMINATION of empirical demand curves with the help of the
method of multiple correlation¹ it is necessary to eliminate the
influences of factors other than price. Then, only the relation between
consumption and price, i.e. the demand curve, remains.

In order to be able to exclude the influences of those factors on the 
demand curve, the factors must be independent. They must not be 
influenced themselves by the changes in quantity and price of the com- 
modity, but they must cause the fluctuations of price or quantity which 
are to be excluded. If a change in the price of the commodity causes 
a change in some factor which in turn causes a change in the quantity 
consumed, then the change in the price indirectly causes the change in 
the quantity consumed. This relationship between changes in the price 
and changes in the quantity consumed which are dependent on changes 
in the price, is to be retained; it should not be eliminated. 

It is the purpose of this paper to demonstrate that the uncritical use 
of per capita figures in the computation of demand curves violates this 
generally acknowledged principle. 

This may be exemplified by the relationship between corn and hogs. 
A decrease in the price of corn causes an increase in the number of hogs 
on farms, which leads to an increase in the consumption of corn. If we 
neglect other uses of corn and changes in the amount of corn fed per hog, 
the decrease in the price of corn can influence the consumption of 
corn only by means of an increase in the number of hogs. If the 
number of hogs were independent of the price of corn, the consumption 
of corn would also be independent of the price of corn. The manner in 
which the number of hogs responds to a change in the price of corn 
determines the demand curve for corn instead of disturbing it. If the 
quantity of corn consumed is correlated with the number of hogs in 
order to eliminate the influence of changes in the number of hogs, the
correlation will be perfect because the quantity is correlated with itself.²

* Journal Paper No. J-854 of the Iowa Agricultural Experiment Station, Ames, Iowa. Project No. 
710. The author acknowledges the helpful criticism by T. W. Schultz and Geoffrey Shepherd. 

1 See the work of Henry Schultz, The Theory and Measurement of Demand, 1938, and many others. 

2 To be more exact: The quantity is correlated with a proportional value. The same holds true if
the deviations of the quantities from the price-quantity regression curve instead of the quantities are
correlated with the number of hogs and if the price-quantity regression curve is adjusted according to
the deviations from the quantity-hogs regression curve. The result does not depend on the sequence in
which the regression curves are drawn, i.e. it makes no difference whether the corn consumption is first
correlated with the number of hogs and the deviations are correlated with the price, or the corn con-
sumption is first correlated with the price and the deviations are correlated with the number of hogs.
If the price deviations from the price-quantity regression curve instead of the quantity deviations are
correlated with the number of hogs, it is the same as if the price were twice correlated with the quantity
consumed. This is as misleading as the other correlation with the number of hogs. (See Geoffrey Shepherd
and Walter W. Wilcox: Stabilizing Corn Supplies by Storage, Agricultural Experiment Station, Iowa
State College, Bulletin 368, December 1937, pp. 340 ff.)

The correlation between the number of hogs and the corn consump-
tion is not perfect, as an empirical analysis shows (see Chart I),³ be-
cause corn is not consumed exclusively by hogs; it is also used in the
wet-process industry, fed to other livestock, or exported.⁴ Correlating

3 The adjusted index of determination between corn production and the number of animal units on
farms as of January 1 of the following year is $r^2 = 0.32$, the regression equation being
(1) $y = 17.8044 + 0.00507x$
where $y$ is the number of animal units and $x$ is bushels of corn. These correspond to the data used by
Henry Schultz. Data corresponding to those used by G. Shepherd (corn supply and hog slaughter under
federal inspection during the season from October to September) give $r^2 = 0.47$ and
(2) $y = 3.0195 + 0.00258x$
where $y$ is thousand pounds of hogs slaughtered (see Chart II). Using corn production instead of corn
supply (see G. Shepherd and W. Wilcox, op. cit., p. 316) gives
(3) $y = 3.3759 + 0.00264x$
and $r^2 = 0.48$ (Chart III). Adjusting for suddenly increased crops (see loc. cit., p. 317) gives
(4) $y = 2.9537 + 0.00299x$
and $r^2 = 0.72$ (Chart IV).

4 In addition there are statistical errors and the consumption is not the same for all hogs and for all 
periods. If the corn price is low more corn is fed per hog and the weight of hogs increases. This is partly 
taken care of if the total weight of hogs slaughtered is used instead of the number of hogs on farm. See 
equations (2) to (4) in footnote 3 and Charts II to IV. 
corn consumption with the number of hogs eliminates only that part
of the influence of corn price on corn consumption which is exerted by 


[Chart III: hogs slaughtered under federal inspection, October-September (in millions of pounds), plotted against production of corn (in millions of bushels).]

[Chart IV: hogs slaughtered under federal inspection, October-September (in millions of pounds), plotted against adjusted production of corn (in millions of bushels).]


changes in the number of hogs.⁵ The demand curve obtained is not the
demand curve for corn in general but the demand curve for corn used 
for purposes other than hog feed. It is analogous to a correlation with 
the exports of wheat that leads to the demand curve for wheat used for 
purposes other than exports, i.e. to the curve of domestic demand for 
wheat. 

The use of per capita figures may be legitimate if the change in the
population is not perceptibly affected in the short run by the price of
the particular commodity. An example is the change in the size of the
human population. We may assume that a change in human population
influences the demand curve for food in a proportional manner.⁶ This
influence of the change in the size of population is best eliminated by
deflating the quantity series (for example, wheat consumption) by the
population series, a procedure that assumes the demand curve to
change proportionally with the change in the size of population, re-
gardless of the level at which price is fixed.⁷ The per capita consump-
tion of wheat can be used in the computation of a demand curve for
wheat because we may assume correctly that the change in the human
population does not depend on the short-run changes in the price of
wheat.

5 A high correlation between the deviations of the quantities from the price-quantity regression
curve and the number of hogs probably indicates that most corn is fed to hogs. The smaller the devia-
tions from the quantity-hog regression curve, the less reliable they probably are for determining the
demand curve for other uses of corn.

6 See Henry Schultz, op. cit., pp. 143 ff. See, however, Lowell J. Reed, “A Form of Saturation
Curve,” this JOURNAL, Vol. XX, 1925, pp. 390-396.

[Charts V and VI: production of hay (Chart V) and of oats (Chart VI) plotted against the number of animal units.]

The statistical similarity between wheat consumption per capita and
corn consumption per hog probably led Henry Schultz⁸ to the analogous
deflation of the corn consumption by the number of animal units, which
does not yield the demand curve, as noted above.⁹ Obviously Henry
Schultz neglected the important difference between people and hogs,
that the number of people is unaffected in the short run by the annual
changes in wheat price, while the number of hogs is affected by the
annual changes in corn price.

As in the case of corn, Schultz uses the consumption per animal unit 
in the computation of the demand curves for hay and oats.¹⁰ The ob-
jections raised to the use of corn consumption per hog in determining


7 See Henry Schultz, op. cit., note 25, p. 144. 

8 Henry Schultz, op. cit., pp. 242 ff., particularly points IV and V on p. 243 and point VII on pp. 
269 ff. 

9 G. Shepherd follows Schultz in this respect (G. Shepherd and W. W. Wilcox, loc. cit.; G. Shepherd,
W. K. McPherson, L. T. Brown, and R. M. Hixon, Power Alcohol from Farm Products: Its Chemistry, 
Engineering and Economics, Iowa Agricultural Experiment Station, 1940, pp. 345 ff.) with the difference 
that Henry Schultz deflates by division and G. Shepherd by graphical correlation. 
10 Henry Schultz, op. cit., Chapters IX and XII. 
PRODUCTION (SUPPLY) AND CONSUMPTION UNITS

                             Corn                                   Hay                   Oats
Year    Produc-    Supply*    Produc-    Animal†       Hogs    Produc-    Animal†    Produc-    Animal†
          tion*      Oct.-      tion*      units    slaugh-      tion*      units      tion*      units
                     Sept.  adjusted§                 ter¶‡
                                                 Oct.-Sept.
        Million    Million    Million   Millions    Million    Million   Millions    Million   Millions
            bu.        bu.        bu.                  lbs. short tons                   bu.

1915       2829                            34.12                 91.44      26.96       1435      25.73
1916       2425                            34.04                 98.63      27.67       1139      26.02
1917       2908                            35.41                 85.02      28.24       1443      26.52
1918       2441                            35.83                 82.29      28.15       1429      26.53
1919       2679                            34.30                 92.49      26.03       1107      25.35
1920       3071                            33.53                 91.67      26.40       1444      24.79
1921       2928                            33.50                 84.82      26.25       1045      24.37

1922       2707       2957       2707      35.02      11440      95.15      25.89       1148      23.89
1923       2875       3004       2763      33.96      12010      89.42      25.37       1227      23.27
1924       2223       2393       2223      31.13      10260      91.45      24.71       1416      22.59
1925       2798       2900       2415      29.78       9780      78.83      23.93       1405      21.96
1926       2547       2825       2547      29.63      10010      76.03      23.14       1153      21.10
1927       2616       2833       2570      30.22      10820      98.15      22.61       1093      20.49
1928       2666       2758       2633      29.33      11320      83.84      22.36       1313      19.89
1929       2521       2669       2521      28.67      10530      87.28      22.49       1113      19.44
1930       2080       2216       2080      28.47      10200      74.73      22.63       1275      19.00
1931       2576       2744       2245      29.38      10650      74.72      22.94       1124      18.66
1932       2931       3201       2694      30.26      10920      83.7       23.56       1251      18.48
1933       2400       2786       2400      29.96       9870      74.94      24.16        733      18.32
1934       1461       1798       1461      25.37       6740      60.00      23.02        542      17.65
1935       2304       2369       1742      25.80       7190      89.53      22.63       1195      17.33
1936       1507       1687       1507      25.48       7540      70.39      22.34        785      17.02
1937       2651       2717       1888      25.44       8090      82.62      21.89       1162      16.64
1938       2542       2904       2542      26.26       9310      90.74      21.88       1054      16.42

* Source: Agricultural Statistics.

† Source: 1915-1928, Henry Schultz, op. cit., pp. 671 ff.; 1929-1938, computed as indicated by
Schultz and spliced on basis 1928; stocks on farm on January 1 of the following year.

‡ Source: The Livestock Situation.

§ A weighted average of the current year (once) and the previous year (twice) when the production
increased, and the current year only when the production decreased.

¶ Federally inspected.
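The adjustment described in footnote § can be checked against the table. The following is a minimal Python sketch, assuming that "a weighted average of the current year (once) and the previous year (twice)" means (current + 2 × previous) / 3, rounded to the nearest million bushels. The function name and the rounding rule are assumptions of this sketch and are not stated in the source.

```python
# Corn production in million bushels, 1921-1938, copied from the table above.
CORN_PRODUCTION = {
    1921: 2928, 1922: 2707, 1923: 2875, 1924: 2223, 1925: 2798, 1926: 2547,
    1927: 2616, 1928: 2666, 1929: 2521, 1930: 2080, 1931: 2576, 1932: 2931,
    1933: 2400, 1934: 1461, 1935: 2304, 1936: 1507, 1937: 2651, 1938: 2542,
}


def adjusted_production(production):
    """Return {year: adjusted production} for each year that has a predecessor.

    Assumed rule (from footnote §): if production rose over the previous
    year, average the current year once and the previous year twice;
    otherwise keep the current year's figure unchanged.
    The years are assumed to be consecutive.
    """
    adjusted = {}
    years = sorted(production)
    for prev_year, year in zip(years, years[1:]):
        prev, cur = production[prev_year], production[year]
        if cur > prev:
            adjusted[year] = round((cur + 2 * prev) / 3)
        else:
            adjusted[year] = cur
    return adjusted


if __name__ == "__main__":
    for year, value in adjusted_production(CORN_PRODUCTION).items():
        print(year, value)
```

Under these assumptions the sketch gives, for example, 2763 for 1923 and 1742 for 1935, which agree with the adjusted-production column of the table.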


the demand curve for corn are also relevant in these cases. But as can 
be seen from Charts V and VI,¹¹ the correlation between total produc-
tion of hay or of oats and the number of animal units is lower than
between corn and hogs. Consequently the number of animal units used
for the computation of the demand curve is much less dependent on 


11 The regression equation between production of hay and animal units is
(5) $y = 17.1820 + 0.0853x$
and $r^2 = 0.11$ (Chart V) where $y$ is the number of animal units and $x$ is short tons of hay. The regression
equation between production of oats and animal units is
(6) $y = 11.7491 + 0.0082x$
and $r^2 = 0.27$ where $x$ is bushels of oats (Chart VI).
the supply of hay or oats than the number of hogs is on the supply of
corn. Deflation of the total supply by the number of animal units does 
less harm than in the case of corn and hogs, but one has to be aware 
of the danger also in this case and carefully check on the dependence. 
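The dependence check recommended here can be sketched with an ordinary least-squares line, as in equation (5) of footnote 11. The Python below is illustrative only: it uses the hay and animal-unit columns of the table, and it assumes that the "adjusted index of determination" is 1 − (1 − r²)(n − 1)/(n − 2), a common adjustment for a two-parameter line that the source does not spell out. The function and variable names are hypothetical.

```python
# Hay production (million short tons) and animal units (millions), 1915-1938,
# copied from the table; the 1932 hay figure is kept as printed (83.7).
HAY = [91.44, 98.63, 85.02, 82.29, 92.49, 91.67, 84.82, 95.15, 89.42, 91.45,
       78.83, 76.03, 98.15, 83.84, 87.28, 74.73, 74.72, 83.7, 74.94, 60.00,
       89.53, 70.39, 82.62, 90.74]
ANIMAL_UNITS = [26.96, 27.67, 28.24, 28.15, 26.03, 26.40, 26.25, 25.89, 25.37,
                24.71, 23.93, 23.14, 22.61, 22.36, 22.49, 22.63, 22.94, 23.56,
                24.16, 23.02, 22.63, 22.34, 21.89, 21.88]


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x.

    Returns (a, b, r2, adjusted_r2); the adjustment formula is an assumption.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return intercept, slope, r2, adjusted_r2


if __name__ == "__main__":
    a, b, r2, adj = fit_line(HAY, ANIMAL_UNITS)
    print(f"y = {a:.4f} + {b:.4f}x, r2 = {r2:.2f}, adjusted r2 = {adj:.2f}")
```

Run on these two columns, the fitted slope and intercept come out close to those of equation (5). A low adjusted index, as here, is what the text takes as a sign that deflation by animal units does comparatively little harm.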

The smaller the effect of a change in the price of the commodity upon 
the number of animal units, the less objectionable is the use of per 
capita figures. The greater the effect of the independent change of 
the number of animal units upon consumption and the price of the 
commodity, the greater is the advantage of using per capita figures. It 
is up to the research worker to decide in each case which effect prevails 
and whether the advantages or disadvantages of the use of per capita 
figures are greater. Sometimes this may be a hard job because the two 
effects cannot always be separated and compared. 








FEDERAL RESERVE BANK OF NEW YORK INDEXES 
OF PRODUCTION AND TRADE 


By Norris O. JOHNSON 
Federal Reserve Bank of New York 


THIS NOTE is written to bring up to date the article, “New Indexes of
Production and Trade,” published in the June, 1938, issue of this 
JOURNAL, principally to set forth certain modifications which have been 
made in composition and weights. The changes that have been in- 
troduced do not represent a departure from the original method of 
computation; they are, rather, an integral part of it, since they cor- 
respond to the procedure applied when the indexes were worked up for 
1919-1937, which was the “historical” period at the time that the 
indexes were first computed. In other words, the changes reflect an 
effort to keep the composition of the indexes as much as possible in 
harmony with structural changes in the economy itself, and to make 
use of new data to increase the coverage of the indexes. In the same way 
that the indexes evolved from about 60 series in 1919 to a considerably 
broader basis of 82 series in 1937, the availability and use of additional 
data have permitted further enlargement of the base to 88 series at the 
beginning of 1941. The actual increase in the breadth of the underlying 
data has been even greater, since individual series have, by and large, 
tended to improve in quality. 

More specifically, the changes fall into three main categories: 

1. Replacements of series formerly used with revised, improved or 
more inclusive data. 

2. Addition of new series covering lines of activity not formerly in- 
cluded but important enough to warrant representation. Certain 
of the new series, such as aircraft added in January, 1940, improve 
the coverage of industries which have been greatly stimulated by 
the National Defense program. In all, nine new series have been 
added since the original article was published. 

3. Changes in weights, not only to compensate for the addition or 
subtraction of other series, but also to reflect the growing or 
diminishing relative importance (as indicated by census data) of 
various lines of activity. 


In some instances trend lines have been readjusted, though generally
without important net effects (at the points of shift) on the levels of the
indexes.² All seasonal adjustment factors have been reviewed periodi-
cally and revised in the light of the performance of recent data.

1 Two series, newsprint paper production and wholesale grocery sales, have been dropped without
direct replacement. Replacements resulted in an additional reduction of one in the number of individual
series used, leaving a net increase of six.

The most prominent feature of changes introduced over the past 
three years has been the further development of man-hour series to 
measure activity in industries where there is no simple unit in which 
output may be expressed. In this respect, the record of the Bureau of 
Labor Statistics data on average hours worked per week, lengthening 
with the passage of time, together with the earlier National Industrial 
Conference Board figures, has been a valuable resource. Although it 
should always be held in mind that man-hours of employment data are 
not direct measures of production, the former can be expected to be 
correlated with production and with necessary adjustments can shed 
light on activity in important sectors of industry which might otherwise 
lack coverage. The following thirteen man-hours series are in current 
use: furniture, shipbuilding, chemicals, commercial printing, men’s 
clothing, electrical apparatus, aircraft, commercial baking, foundries 
and machine shops, agricultural implements, radios and phonographs, 
machine tools, and manufacturing as a whole. While most of these 
represent new additions, those on furniture, shipbuilding, commercial 
printing, and men’s clothing replaced less satisfactory series, and the 
chemicals series replaced three less inclusive ones (paints and varnishes, 
fertilizers, and cottonseed oil). 

While the current weights of new man-hours series total 7.6, the sum 
of the weights of all man-hours series has been increased only from 10.5 
to 12.4. The weight given man-hours of employment in manufacturing 
as a whole has been cut sharply since the need for such an over-all 
series has been reduced by the addition of man-hours series for par- 
ticular lines of activity not theretofore covered. 

While the results do not necessarily constitute a proof of the ac- 
curacy of the indexes, it is interesting to note that the index of produc- 
tion and trade as a whole shows rather close correlation with estimates 
of the percentage of the laboring force employed, and also with National 
income figures (the latter on a per capita basis, deflated by a cost of 
living index, and adjusted for estimated trend). The production and 
trade index has a somewhat greater amplitude than either the percent- 
age of labor force employed or the deflated National income figures, a 
circumstance that may be attributable in some part to the fact that 
the production and trade index, despite its relatively broad basis, may 
not adequately cover those fields of occupation where employment and 





2 In one case trend revision involved recomputation of the index of distribution to consumer and of 
the index of production and trade from 1936 onward. 
COMPOSITION OF PRODUCTION AND TRADE INDEXES*
1937 AND 1941

                                                  Weights in total index      Number of series
                                                         1937       1941       1937       1941
  Producers’ Durable Goods                               10.9       13.6         16         20
  Producers’ Nondurable Goods                            12.3       13.6         16         14
  Consumers’ Durable Goods                                5.6        6.0         r)          6
  Consumers’ Nondurable Goods                            17.6       19.0        265         26
Man-hours of employment, manufacturing as a whole         8.6        3.0          1          1
Production                                               55.0       55.0         61         66
Primary Distribution                                     16.1       15.8          9          8
Distribution to Consumer                                 26.0       26.0          6          6
Miscellaneous Services                                    2.9        3.2          6          8
Index of Production and Trade                           100.0      100.0         82         88





* It should be noted that production items, except Man-hours of employment, manufacturing as a 
whole, are classified into subgroups (italicized section of the table) according to their immediate—as 
opposed to ultimate—use. Thus Producers’ Nondurable Goods includes production of materials which 
later find their way into consumers’ goods as well as into Producers’ Durable Goods. 
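The role of the group weights can be illustrated with a short Python sketch. It assumes that the total index is a weighted arithmetic mean of the group indexes, which the note does not state explicitly, and the group index values in the example are invented for illustration. Only the four top-level weights from the table are used.

```python
# Top-level weights in the total index, from the table above.
WEIGHTS = {
    1937: {"Production": 55.0, "Primary Distribution": 16.1,
           "Distribution to Consumer": 26.0, "Miscellaneous Services": 2.9},
    1941: {"Production": 55.0, "Primary Distribution": 15.8,
           "Distribution to Consumer": 26.0, "Miscellaneous Services": 3.2},
}


def composite_index(group_indexes, weights):
    """Weighted arithmetic mean of group indexes (assumed combination rule)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * group_indexes[name] for name in weights)
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Hypothetical group readings: production 10 per cent above the other groups.
    example = {"Production": 110.0, "Primary Distribution": 100.0,
               "Distribution to Consumer": 100.0, "Miscellaneous Services": 100.0}
    print(composite_index(example, WEIGHTS[1941]))
```

With a weight of 55 out of 100, a 10-point rise in the production group alone moves the composite by 5.5 points under this assumed rule.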


incomes are relatively stable. In the comparison between the production 
and trade index and the percentage of labor force employed, the lesser 
amplitude of fluctuations in the latter may be explained to an appreci- 
able extent by the fact that, because of cyclical changes in the hours of 
work offered employed workers, the number employed fluctuates less 
widely than the aggregate amount of employment measured in man-
hours.³

3 Tabulations of the indexes of production and trade (1919 to date) and material on present weights
and composition are available upon request. Current indexes are published in the Federal Reserve Bank
of New York's Monthly Review of Credit and Business Conditions, Second Federal Reserve District.














TOWARD STANDARDIZED SYMBOLS FOR BASIC 
STATISTICAL CONCEPTS 


By FREDERICK E. CROXTON
Columbia University 


THE NON-PROFESSIONAL USER of statistical text and reference books
complains that he is confused by the varying symbols em-
ployed by different authors. This is a difficulty which also besets the
student of statistics, especially the beginner, who has not had extensive 
mathematical training. Many a student, not understanding a topic and 
having been urged by his instructor to read some book in addition to 
the regular text, returns to report, “I don’t understand — either; I can’t 
figure out his symbols.” Having gotten used to one set of symbols, he 
finds that the transfer to a partially or wholly different set is trouble- 
some. 

The student with a mathematical background and the professional 
statistician are not bothered to the same extent by the non-existence 
of standard symbols. Having been told once what symbols are being 
used and what each symbol stands for, they are ready to go ahead and 
follow the author’s development. However, even the most intelligent 
and mathematically competent reader is slowed down in his perusal of 
a volume if his mind must adjust itself to new symbols as well as to new 
ideas. 

If our statistical literature, and most important of all our statistical 
texts, could be written with a standard set of symbols for basic statisti- 
cal concepts, users of all levels of competence would profit immeasur- 
ably. But, most important of all, the beginning student would benefit 
since his confusion when referring to different texts would be mini- 
mized. The vast majority of the students who take one or two courses 
in statistics do not expect to become statisticians and their instructors 
have no intention of training them as they would train those students 
who expect to become professional statisticians. There must be scores 
who are interested in obtaining a working knowledge of the basic prin- 
ciples of statistics, so that they may occasionally read the results of 
statistical studies or occasionally apply modest statistical procedures 
in their every-day work, to every one who enters the field intending to 
specialize. Statistics courses have a reputation for being difficult, but 
they are not uninteresting if properly taught and they can be much 
less difficult if the symbols are standardized on some reasonable and 
logical basis. 

The accompanying table shows for various concepts associated with
the frequency distribution the symbols used in five well known texts
published in the United States. Since it is not intended to suggest that
any one of these texts has any better set of symbols than the others,
but merely to indicate the lack of standardization, the texts are not
identified.

It is not the purpose of this article to set up a proposed set of stand-
ard symbols for basic statistical concepts. That should be a cooperative
task. However, it would seem that a standardized set of symbols should
conform to these, and possibly other, criteria:

1. If a symbol is already in more or less general use it should be the
standard symbol for that concept. This does not mean that because a
leading text or an eminent authority has used a certain symbol, that
symbol should therefore be retained.

2. Where possible the symbol should suggest the concept. Thus f
suggests frequency, N number of items, and s or σ a standard devia-
tion.

3. The same symbol should not stand for more than one concept.
For example ρ is sometimes used to indicate the coefficient of rank
correlation and also the coefficient of non-linear correlation.

4. The same concept should always be indicated by the same
symbol. Thus if X₁, X₂, X₃, etc. are used to indicate the variables in
multiple correlation, it does not seem completely logical to use X
and Y for two-variable correlation.

5. The symbols should be easy to write. In printed matter it is
possible to show a symbol in either light face (σ) or bold face (𝛔) type.
This can not be done readily in long hand, especially when an instruc-
tor is writing rapidly on the blackboard as he lectures.

6. The symbols should be easy to use orally. Thus &, ¢, &, and @
can be readily printed but are not easily used in oral discussion.

7. The symbols should be those which are generally available to
printers, requiring a minimum of special type to be made when a
book or article is printed.

8. The symbols should exhibit internal logic. Thus, the symbol for
the scatter around an estimating equation should be of the same type
as the symbol used for the standard deviation, since the former is a
standard deviation around an equation. Also, Y=a+bX is better
than certain other forms for the straight line since it lends itself to
the addition of cX², dX³, and so on to indicate curves of higher
degree.

One objection which has been made to previous attempts to stand-
ardize symbols is that all statistical texts but one would be rendered
obsolete. This argument is based on the false hypothesis that some one
text now has an ideal set of symbols, or that some one author can
talk louder and faster than the other members of the standardizing
group. It is much more likely that setting up a set of standard symbols
will make a new edition in order for all statistical text books.

Progress might be assured by the appointment of a small, but repre-
sentative and open-minded committee which, after due deliberation,
might first submit any controversial phases to a sample of the Associa-
tion’s membership and finally determine upon a standard set of sym-
bols for basic statistical concepts.
SYMBOLS USED FOR CONCEPTS ASSOCIATED WITH THE FREQUENCY DISTRIBUTION
IN FIVE TEXT BOOKS PUBLISHED IN THE UNITED STATES

(Where no symbol is shown in this table the author either did not discuss the concept or made use of no
symbol. If a text uses the same symbol for a sample concept and a population concept, the symbol used
for the sample concept is usually altered when the topic of reliability and significance is treated.)

Concept | Text A | Text B | Text C | Text D | Text E

Item of a series x m xX m x
Number of items N N N n N
Summation Σ Σ Σ Σ Σ
Arithmetic mean of sample zx M, X X M,M; xX
Deviation from arithmetic mean x d zx d x
Assumed or designated mean x’ M' Xa R > +
Deviation from arbitrary origin d’ d d
Deviation from arbitrary origin, in class intervals d d’ a’ dj 2°
Frequency f f f f f
Class interval Ci i i i c
Median Med. Md Med Md Median
Mode Mo. Mo Mo Mo
Geometric mean My, My, G GM G
Harmonic mean Mi H H HM H
Arithmetic mean of population . M Xp Mm
Standard deviation of sample o 0,8 o 0,0z O,8
Standard deviation of population Ox o Op o o
Standard deviation of population, estimated from a sample 8 o 3’
Standard error of arithmetic mean ou Om 8z oz Ou, Om oF
Standard error of difference between two arithmetic means Opitt. o Os = Od C- =-
Degrees of freedom “ “G7 D, Dg aM
Moments around arbitrary origin »’ v v » 2p’
Moments around arithmetic mean . Ba . m m
Moments around arithmetic mean, with Sheppard’s corrections m mn ue m p*
Computed normal curve ordinates

















NOTES AND DISCUSSIONS 


ON THE CALCULATION OF CORRELATION COEFFICIENTS 
WITH MODERN CALCULATING MACHINES 


I 


In this JOURNAL for December 1940, Dr. P. S. Dwyer points out in a paper
with nearly the above title that squares may be formed on the Monroe model
Al calculating machine with a single setting. In case any priority in this vir-
tue is claimed by or for the Monroe Company it seems only fair to point out
that certain models of the Madas machine (manufactured in Switzerland 
and therefore perhaps not well known in the United States) have been able 
to do this for six or seven years—a fact that was first brought to my notice 
by Dr. R. Taggert, then of the Shirley Institute, Manchester. The repeat 
key must be held down by a special spring clip (rubber bands were used in 
the early days!), the number set, and the multiplication bar pressed twice. 

I should like to take the opportunity of protesting against the use of the 
terms “upper dials,” “middle dials,” and “lower dials” when describing 
operations on calculating machines. What is the upper dial on one machine is 
the lower dial on another and may easily be the lower dial on a later model 
of the same machine. Hence the terminology is confusing—an objection that 
does not apply to “product register,” “multiplication register,” and “setting 
register.” 

L. J. COMRIE

Scientific Computing Service Limited, London 


II 


It was not the purpose of the article to claim any priority for any company 
but to point out to the readers of this JOURNAL who were not aware of this
development that the availability of machines having this feature indicates, 
in many cases, immediate changes in the method of calculation of correlation 
coefficients. The Monroe appears to be the only American automatic electric 
calculating machine which uses the same setting mechanism for both mul- 
tiplicand and multiplier. This statement is limited to calculating machines, 
as the term is technically used, and does not refer to a punched card machine 
such as the IBM automatic multiplying punch, which has been used in ob- 
taining squares from a single punching operation for more than a decade, 
nor to machines which are primarily adding machines. 
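Why single-setting squaring matters for correlation work can be seen from the familiar raw-sums form of the coefficient, in which every observation contributes a square and a cross product. The Python sketch below illustrates that textbook formula; it is not taken from Dr. Dwyer's paper, which is not reproduced here, and the function name is hypothetical.

```python
from math import sqrt


def correlation_from_sums(xs, ys):
    """Pearson r computed from the five running totals a desk machine accumulates.

    Uses the raw-sums identity
        r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2)).
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)   # each term is a square of a single setting
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


if __name__ == "__main__":
    print(correlation_from_sums([1, 2, 3], [2, 4, 6]))
```

Two of the five totals are sums of squares, so a machine that squares from one setting saves a keying operation on every observation.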

The remark of Dr. Comrie that the Madas also qualifies is of interest and 
leads one to inquire as to the existence of other non-American makes with a 
similar mechanism. Without taking time to outline the mechanical proper- 
ties of each make of machine, it can be said that necessary conditions for the 
most efficient computation of the square from a single setting include: (1) 
the machine must be electric (or at least not operated manually), (2) it must
be fully automatic in the operation of multiplication (including an automatic
shift), and (3) it must use the same part of the same setting mechanism for
multiplicand and multiplier (or at least it must be so constructed that a 
single setting operation will set the number to be squared as multiplicand 
and multiplier). 

In line with the implication that the discussion should be less provincial, 
I have attempted to secure information about all calculating machines which 
satisfy the three conditions though the information was obtained, in part, 
from written descriptions of the non-American machines rather than from 
actual contact with them. Also the incidence of the war has presumably 
slowed up improvements in British and Continental makes and it certainly 
has made it more difficult for us to obtain information about non-American 
machines. The following statements should be interpreted with the above 
facts in mind. 

We might define a “modern” computing machine to be one which satisfies 
conditions (1) and (2), and also (4) one which is equipped with fully auto- 
matic division. Most, if not all, of the machines satisfying (1) and (2) also 
satisfy (4) and for our present purpose we confine our discussion to these. 
Out of a list of over twenty different makes of calculating machines I find 
that only Friden, Hamann, Madas, Marchant, Mathematon, Mercedes
Euklid, Millionaire, Monroe, and perhaps Archimedes (Germany) have elec- 
tric models which are fully automatic in multiplication. Each of these ma- 
chines has been examined with respect to the third condition and it appears 
that the Madas and the Monroe are the only ones which satisfy it. (I have 
been unable to obtain information as to whether or not the Archimedes has 
fully automatic models and, if so, if condition (3) is satisfied.) It is possible 
that minor mechanical additions might be made to such a machine as the 
Mercedes Euklid Model 38 SM, which uses different portions of the same 
setting mechanism for multiplicand and multiplier, to obtain the square 
from a single setting. But it appears that drastic revision would be needed 
in the mechanical construction of the other makes mentioned if this end 
were to be attained. Much of this information has been gleaned from the 
reports of Office Machines Research, Inc., Rockefeller Center, New York. 

I agree with the comments of Dr. Comrie regarding such terms as “upper 
dials,” “middle dials,” and “lower dials.” These terms were not used in my 
paper, with the single exception of an “upper dial” in a situation where the 
context indicated the revolution register. It appears to me that the terminol- 
ogy “product register,” “revolution register,” and “setting mechanism” is as 
satisfactory as any which is applicable to all types and varieties of calculat- 
ing machines. This is the terminology arrived at by Office Machines Research 
in the attempt to describe and compare the various computing machines by 
using a universal nomenclature. 

P. S. DWYER


University of Michigan 
THE EFFECTS OF FEDERAL REVENUE ACTS OF 1938, 1939,
AND 1940 ON THE REALIZATION OF GAINS 
AND LOSSES ON SECURITIES 


Comment 


The formulae developed in the December, 1940, issue of this JOURNAL by
Messrs. Walker, Collins and Higgins for determining the effect of the income 
tax on the desirability of shifting security holdings leave much to be desired 
as guides to investment policy. In the case of realization of capital gains, 
important differences in potential tax liability are ignored, and in the case of 
realization of losses a formulation of the problem is used that appears un- 
realistic. 

On page 607 a formula is presented for calculating the amount of apprecia- 
tion required in a proposed new investment in order that the market value 
of this new investment purchased with the net proceeds after tax of the sale 
of old investment which has already appreciated in value shall equal the 
current market value of the old investment. But the taxpayer who thus has 
succeeded in restoring the total market value of his holding is not in a posi- 
tion comparable to that which would have resulted had the taxpayer held 
on to his original investment, assuming that the market value of the original 
investment would remain unchanged. He is potentially much better off be- 
cause the “basis” of his present investment is much greater and the tax that 
he will have to pay if he ever realizes the assumed capital gain on the second 
investment is so much less than that payable had he held on to the original 
asset. The formula given therefore cannot be generally used to determine 
whether a proposed shift in investments is likely to be advantageous. The 
only case where the formula given would be a correct guide to procedure 
would be where the assets would be held until the death of the taxpayer, and 
so escape tax altogether (it would have to be assumed that this loophole in 
the income tax law remained open at the time of the taxpayer’s death) or 
where the gain could be realized at a time when the taxpayer had other losses 
which he otherwise would not be able to offset against income. 

A true comparison would involve considering the position of the taxpayer
at some future date after the liquidation of either asset and the payment of
all taxes on the assumed capital gains. Thus if asset A is purchased at a cost
of 1 and sold at $(1+R)$, where $R$ is the ratio of gain realized to the cost of
the asset, the tax will be $TR$, where $T$ is the applicable rate of tax. The
amount available for the purchase of the second asset B will be $(1+R-TR)$; if
the market price of B advances from 1 to $(1+r)$, then the amount realized from
the sale of B will be $(1+R-TR)(1+r)$, the tax will be $tr(1+R-TR)$,
where $t$ is the tax rate applicable at the time the second gain is realized, and
the net amount realized and available either for expenditure or for further
reinvestment if desired, will be $(1+R-TR)(1+r-tr)$. If on the other hand
the original asset had been held until this time and then disposed of, the
tax would be $Rt'$, assuming that there has been no further change in the
market value of A and where the applicable rate of tax $t'$ may not be the same
as $t$ because of the difference in the category in which the asset falls because
of the difference in time held. The switching of assets will have been
advantageous or otherwise according to whether $(1+R-TR)(1+r-tr)$ is greater
or less than $(1+R-t'R)$. Solving for $r$, we have
$r = R(T-t')/[(1+R-TR)(1-t)]$
as the return on the new investment in order to obtain net proceeds equal to
that obtainable by retaining the present investment. Assuming that $T$ and
$t'$ are equal then the switch will be profitable whenever $r$ is greater than zero,
a result that is at variance with what appears in the article. If $t$ and $t'$ are
both assumed to be zero, then these two expressions will be equal when
$r = (RT)/(1+R-RT)$ which is the formula given on page 607. This is only
a very special case, however, which occurs only if the taxpayer has losses
which would otherwise be non-deductible, or if the taxpayer renounces
further dealings in the assets involved until his death.
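
The algebra of the preceding paragraph can be checked numerically. The following is only a minimal sketch in Python: the function names and the example rates are hypothetical illustrations, not figures from the article under discussion. It computes the break-even appreciation r on the new asset and confirms the two special cases mentioned (equal rates T and t', and t = t' = 0).

```python
def net_after_switch(R, T, t, r):
    """Net proceeds per unit of original cost if asset A is sold now
    (tax rate T on the gain R), the after-tax proceeds are put into B,
    and B is later sold after appreciating by r (tax rate t on that gain)."""
    return (1 + R - T * R) * (1 + r - t * r)


def net_if_held(R, t_prime):
    """Net proceeds if A is instead held and sold later at an unchanged
    market value, with tax rate t_prime applied to the gain R."""
    return 1 + R - t_prime * R


def break_even_r(R, T, t, t_prime):
    """Appreciation r on B at which switching and holding yield equal proceeds."""
    return R * (T - t_prime) / ((1 + R - T * R) * (1 - t))


# Hypothetical figures: a 50% gain, taxed at 30% if realized now
# and at 15% if realized later.
R, T, t, t_prime = 0.50, 0.30, 0.15, 0.15
r_star = break_even_r(R, T, t, t_prime)
assert abs(net_after_switch(R, T, t, r_star) - net_if_held(R, t_prime)) < 1e-12

# Equal rates (T == t'): the break-even appreciation is zero,
# so any positive r makes the switch profitable.
assert break_even_r(R, 0.30, 0.30, 0.30) == 0.0

# t == t' == 0: the expression reduces to RT / (1 + R - RT).
assert abs(break_even_r(R, T, 0.0, 0.0) - R * T / (1 + R - R * T)) < 1e-12
```

With these assumed rates the break-even appreciation is about 6.5 per cent; interest is neglected, as in the text.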

If any generalization can be made about the effect of the income tax upon 
the desirability of realizing capital gains it is that there is no substantial 
effect if the applicable tax rate remains the same and interest be neglected. 
By far the largest effects appear to be derived from the differences between 
the rates on long and short term capital gains and losses. 

On page 610 a formula is presented for calculating the appreciation re- 
quired in a new asset in order that the investor may recoup the original cost 
of an asset that has suffered depreciation. The use of this formula as a cri- 
terion implies that the investor has such confidence that his present invest- 
ment will regain the lost ground, that he will not shift his investment unless 
the new investment promises to more than recoup this loss. A wise investor 
would usually be willing to shift his holdings if there existed a substantial 
expectation of recovering only part of the loss. Were there a general expec- 
tation that all securities would eventually regain a formerly attained level, 
there could be no decline in security values!

Thus neither of the formulae presented appears to provide a widely
applicable “method for mathematical appraisal of the desirability of realizing
capital gains and losses.” For this purpose the formula
$r = R(T-t')/[(1-t)(1+R-TR)]$
seems more appropriate both for gains and losses, provided
only that $R$ and $r$ are given the proper sign according to whether they
represent gain or loss, and the proper marginal rates of tax in effect are
substituted for $T$, $t$, and $t'$.

It is difficult to see any “good argument” for using the average rate on the 
total income rather than the marginal rate in these formulae (see footnote 7). 
The only case where such a rate could possibly be appropriate would be the 
case where the gain from the proposed transaction would constitute the entire 
income of the taxpayer. Even then it is seldom necessary to dispose of a 
capital asset in its entirety, particularly in the case of a block of securities. 
Where analysis based on marginal rates dictates a shift while that based on 
average rates does not, this usually indicates that the most profitable course 
is to sell a portion of the investment. The exact amount would be determined 
by reference to the marginal rates, as for the initial part of the transaction
taxes will be incurred at the lowest rates and savings will be made at the
highest rates, the advantage of further sales diminishing as the size of the
transaction is increased. Cases where the individual’s entire income is
involved are sufficiently rare to be curiosities rather than typical cases.
WILLIAM VICKREY


Rejoinder 


We have examined Mr. Vickrey’s comments on our article in the December,
1940, issue of this JOURNAL entitled “The Effects of the Revenue Acts
of 1938, 1939 and 1940 on Realization of Capital Gains and Losses on Se- 
curities.” His criticisms seem to arise out of a misunderstanding of the pur- 
pose of the article. At page 602, we said plainly: “The sole purpose of this 
article is to provide the investor a reasonably simple method for evaluating 
the tax factor.” The tax treatment of capital gains and losses is only one of 
many factors to be considered before a decision can be reached on the wisdom 
of the sale of a particular security. The formulae were not designed to deter- 
mine whether a shift in holdings is desirable, but merely to measure the 
magnitude of the tax handicap under the assumed conditions. A clear under- 
standing of the specific premises underlying the formulae is a primary requi- 
site for constructive criticism. 

At page 609, we stated that the objective of the capital gains formula was 
determination of the necessary percentage appreciation on the second
purchase to recoup the market value of the security sold. Mr. Vickrey challenges
the validity of this conception of the problem. He contends that the “switch 
will be profitable whenever r is greater than zero, a result that is at variance 
with what appears in the article.” We are well aware that if r is greater than 
zero the net proceeds of the second sale will be greater than the amount of 
cash invested in the second security. We never posited such a truism. Our 
interest was not liquidated values, but rather a measure of the tax handicap 
to be overcome before the market value of the first investment could be re- 
established. The size of that handicap is one of the many factors the fund 
administrator must consider. 

We note with special interest his contention that use of the capital gains 
formula would be a correct guide to procedure only “where the assets would 
be held until death of the tax-payer, and so escape tax altogether (it would 
have to be assumed that this loop-hole remained open at the time of the 
tax-payer’s death).” Reductio ad absurdum is a tricky form of reasoning. It 
should be used with great caution. We never implied in any manner that tax 
considerations were the sole reasons for the retention or sale of any particular 
security. This is an absurd distortion of our premises. The so-called
“loop-hole” is a novel conception because our government imposes estate taxes
upon the fair market value of security holdings as of the date of death of the
decedent or within a stipulated period thereafter. Apparently, Mr. Vickrey
believes in the possibility of an income tax on unrealized capital gains at the
death of the decedent in addition to an estate tax upon the same market
values. We entertain considerable doubt about the wisdom of closing this
so-called “loop-hole” and the imminence of such a change.

In the case of the capital loss formula (see p. 609), we sought a means for 
determining the percentage appreciation needed on a second purchase to 
permit recoupment of the original capital investment. Here Mr. Vickrey as- 
sumes that the formula “implies that the investor has such confidence that 
his present investment will regain the lost ground that he will not shift his 
investment unless the new investment promises to more than recoup his 
loss.” We can find no logical basis for such an assumption. Our capital loss 
formula was developed to measure the percentage gain required when con- 
sideration is given to the advantage of the tax treatment of capital losses. 
We are quite aware of the need for sale when retention is not likely to be 
fruitful. Such a situation requires no formula for decision. Our interest here 
was a large group of cases where the advantage of taking a tax loss must be 
appraised in relation to the possibility of appreciation on the retained hold- 
ing. Experienced investors are familiar with this problem. 

Mr. Vickrey singles out footnote 7 (see p. 608) for comment. We said there 
that “good argument can be advanced for using the average rate of tax on 
total income” instead of the highest applicable rates on income. We do not 
think his criticism well taken. In the main body of the article we used the 
highest bracket method. The footnote was inserted to give the problem ade- 
quate coverage since there are investors who prefer using the average rate of 
tax. The theory of using the “high bracket” method was developed many 
years ago by bond salesmen to push the sale of tax exempt securities. By 
assuming that each new investment increased income and thus added to the 
amount of income subject to the highest taxes, tax exempt securities were 
made to look more attractive. Frequently, this results in distortion of com- 
parative yields. 

It often happens that the new investment merely replaces another and 
does not add to income; and the income in the next succeeding year from 
other sources may dwindle to such an extent that the advantage of the tax 
exempt security disappears or is negative. The average rate does not distort 
the value of tax exemption so seriously; and it recognizes that there is some- 
thing to be said for the theory of using rates more nearly approaching those 
applicable at the time of incidence of income. Contrary to Mr. Vickrey’s 
assumption, tax experts frequently find cases where gains from sales are the 
sole source of net income. Profits from sale of real estate may often be the 
only income of the taxpayer. However, the whole issue of “average” versus
“high bracket” rates is not very important. The formulae permit use of either 
rate as the investor may prefer. 

This is not the place to expound the principles of fund management and 
the relationship of tax burdens to sound policy. When one has long studied 
and observed closely the evils of indiscriminate tax selling to establish losses 
and the equally important failures to sell because of exaggerated notions 
about the magnitude of tax burdens upon capital gains, there would seem to
be a place for measurement of the arithmetic significance of this factor by
methods based on generally accepted premises of good investment management.
Q. FORREST WALKER
W. T. COLLINS
D. E. HIGGINS

COMMITTEE ON NOMINATIONS 


President Riefler has appointed the Committee on Nominations for 1941. 
The Committee consists of Theodore O. Yntema, University of Chicago and 
the Cowles Commission for Research in Economics, Chicago, Illinois, Chair- 
man; Halbert L. Dunn, Division of Vital Statistics, Bureau of the Census, 
Washington, D. C.; and Dorothy S. Thomas, University of California,
Berkeley, California. The report of the Committee on Nominations will be
published in the November BULLETIN.
R. L. FUNKHOUSER, Secretary








BOOK REVIEWS 


GLENN E. McLAUGHLIN
Review Editor 


Studies in American Demography, by Walter F. Willcox. Ithaca, New York: 
Cornell University Press. 1940. xxx, 556 pp. $4.50. 


Professor Willcox, a pioneer among American demographers, became 80 
years old last March; he may well be rated dean of our demographers. His 
contributions over the past 50 years, whether published or personal through 
his offices in our Association, in our Government, or in international
organizations, fully entitle him to be held in the high esteem with which we
regard him. His writings have inevitably been scattered widely: International
Journal of Ethics, Forum, Economic Studies of the American Economic Association,
Quarterly Journal of Economics, American Law Review, in addition to journals
more especially statistical. (A bibliography of his more important writings
is given on pp. 541-7.)

For all these reasons it is a real service to demographers and quantitative 
sociologists that the author has collected some of the most important of his 
essays and some unpublished material from his lecture notes into one volume 
where they can be readily consulted and where they form not only a partial 
record of his life-work but a background for the work we associate with the 
names of Dublin, Ogburn, Osborn, Rice, Notestein, Stouffer, Dunn, D. S.
Thomas, Lorimer, Hutchinson and a lengthening line of persons working
with social statistics. As a pioneer, Willcox had not at his disposal statistical 
materials as good as those today available, nor methods so highly developed; 
but 40 or 50 years hence that may be said in retrospect of those working now. 
It is probable that at any time the really vital problems of sociology for 
which a quantitative discussion is illuminating will have to be treated with 
inadequate data and insufficient methods that will tax the ingenuity and try 
the patience of the investigator, leaving to those who must have adequate 
data and sufficient methods only the mopping up operations in problems no 
longer of the first importance and interest. 

Willcox has divided his book into a section of twelve studies in American 
Census Statistics, another of eight studies in American Registration Sta- 
tistics, one of four miscellaneous studies, and three appendices. A good index 
and a special tabular contents in which the 225 tables given in the text are 
separately listed are thoughtful provisions for the assistance of the readers. 
I would not give the impression that these nearly 600 pages are merely a 
record or a monument; the author well points out that the nearest analogue 
of his book is the first twelve chapters of George Tucker’s Progress of the 
United States in Fifty Years, published nearly a century ago. A well rounded 
picture of our progress and of our present condition may not be necessary 
for certain special studies in demography but will always be of the utmost 
importance for those whose object is to discuss certain important aspects
of society in the large on a factual basis.
EDWIN B. WILSON


Harvard University 


Théorie analytique des associations biologiques, by Alfred J. Lotka. Deuxième
partie. Analyse démographique avec application particulière à l’espèce
humaine. Paris: Hermann & Cie. 1939. 149 pp. 50 fr.


Population growth and change are regulated by many factors—some bio- 
logical, some economic, some social or cultural. Lotka points out that this 
multiplicity of variables makes the study of actual populations exceedingly 
complex and an approach by deductive methods particularly advantageous. 
In fewer than 150 pages he presents an unusually clear and beautifully writ- 
ten exposition of recent progress in the mathematical theory of biological 
populations, with particular applications to human populations, a very sub- 
stantial part of which is drawn from his own scientific papers. 

The deductive analysis begins with the concept of a closed population, one 
that has no gains or losses by migration, subject to a probability function, 
p(a), that an individual newly born will live to at least age a. From the actual 
or expected annual number of births over a period of time equal to the span 
of life it is then possible to compute the expected number of individuals in 
the population at the end of the period. Expected values for age distributions, 
birth rates, mortality rates, and special results for Malthusian and stationary 
populations follow. By an effective use of Thiele’s semi-invariants the theory 
is extended and applied to actual and hypothetical populations. It requires 
for its completion, however, an additional function expressing the age specific 
birth rate, stated for simplicity in terms of daughters born alive per annum 
per female of age a. This leads to the solution of an integral equation with but 
one real root, ρ, which expresses the intrinsic rate of growth or decline in the
population, apart from periodic components of diminishing amplitude that 
arise from the arbitrary character of the initial age distribution and that are 
expressed by the complex roots. 
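
For readers who want the equation in symbols, the standard form of the relation described above may be sketched as follows. The notation is partly assumed here: p(a) is the survival function named in the review, while m(a) (daughters born alive per annum per female of age a), B(t) (female births per annum at time t), and the use of ρ for the real root are labels chosen for this sketch and need not match the book's own symbols.

```latex
% Renewal equation for births in a closed population
B(t) = \int_{0}^{\infty} B(t-a)\, p(a)\, m(a)\, da

% Trying a solution of the form B(t) = C e^{\rho t} gives the
% characteristic equation, whose single real root \rho is the
% intrinsic rate of growth or decline
\int_{0}^{\infty} e^{-\rho a}\, p(a)\, m(a)\, da = 1
```

Because the left side of the characteristic equation decreases steadily as ρ increases, it can equal 1 for only one real value of ρ; the remaining roots are complex and give rise to the damped periodic components mentioned above.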

After establishing the general theory, Lotka applies it to the distribution 
as to year and generation of the progeny of a cohort of females, to the com- 
parison of various indices and measures of natural increase, to marriage and 
sterility rates, to family composition and orphanhood, and to the extinction 
of a line of descent. 

Almost all of the theory is developed on the assumption of fixed survivor- 
ship and fecundity functions. To the population statistician this will appear 
to limit the value of the theory when applied to the human populations of 
contemporary Europe and America, profoundly affected as they are by the 
military and economic turbulence of our time as well as by the progressive 
decline of reproduction rates. The function of the theory is not, however, to 
supplant inductive studies or provide precise descriptions of what is actually 
taking place; it is to guide and unify population research, particularly in
those phases that are most closely akin to actuarial studies. From it will
spring more elaborate theoretical formulations that incorporate migration, 
changing fertility and mortality functions, and other factors in the complex 
processes of population dynamics. 

FREDERICK F. STEPHAN 


Cornell University 


Blood Pressure Study, 1939. New York: Actuarial Society of America, and
Association of Life Insurance Medical Directors. 1940. ii, 69 pp.

This study represents the second publication by the Joint Committee on 
Mortality dealing solely with the subject of blood pressure. As a result of the 
contribution of fifteen insurance companies the body of data is the largest 
ever assembled. More than 1,000,000 policies in which there occurred ap- 
proximately 49,000 deaths were included in this investigation. The policies 
were issued from 1925 to 1937 inclusive, and terminated at their anniver- 
saries in 1938, for this study. Only white males were included, and further 
to limit the volume of material, those cases in which the systolic pressure 
was less than 128 mm. of mercury and concurrently the diastolic pressure 
was less than 84 mm. were excluded. 

Basic tables giving age specific mortality rates for all causes and by certain
specific causes are provided and are utilized for the calculation of expected 
deaths. The 49,000 odd deaths are broken up into numerous dichotomous 
tables according to systolic and diastolic blood pressure for all the deaths 
taken together, and also in certain specific age groups. These observed deaths 
are compared with the expected deaths in tables giving the percentage ratio 
of actual to expected deaths. The same sort of analysis is made to compare the 
actual with the expected deaths from certain specific causes, that is, cardio- 
vascular-renal diseases, cirrhosis of the liver, cancer, pneumonia, tuberculo- 
sis, suicide and accidents. 
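
The actual-to-expected comparison described above can be illustrated with a short sketch. The figures, age groups, and function names below are hypothetical and are not taken from the study; the sketch only shows the arithmetic of applying basic age specific rates to the exposure in a blood pressure class and expressing actual deaths as a percentage of the expected deaths.

```python
def expected_deaths(exposure_by_age, rate_per_1000_by_age):
    """Expected deaths: exposure in each age group times the basic
    age specific mortality rate (per 1,000) for that group, summed."""
    return sum(
        exposure_by_age[age] * rate_per_1000_by_age[age] / 1000
        for age in exposure_by_age
    )


def mortality_ratio(actual_deaths, expected):
    """Percentage ratio of actual to expected deaths."""
    return 100.0 * actual_deaths / expected


# Hypothetical exposure (policy-years) in one blood pressure class
exposure = {"30-39": 10_000, "40-49": 8_000, "50-59": 5_000}
# Hypothetical basic mortality rates per 1,000 for all causes
basic_rates = {"30-39": 3, "40-49": 6, "50-59": 12}

expected = expected_deaths(exposure, basic_rates)  # 30 + 48 + 60 = 138
ratio = mortality_ratio(207, expected)             # 207 actual deaths assumed
print(f"expected = {expected:.0f}, ratio = {ratio:.0f} per cent")
```

A ratio near 100 per cent indicates mortality close to that expected; a ratio well above 100 marks a class with excess mortality.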

Inspection of the tables indicates broadly that up to certain limits of sys- 
tolic and diastolic pressure at the different ages the actual and expected 
mortality rates are about the same, but that beyond these limits the actual 
mortality rate is greater than what should be expected on the average. These 
points are discussed briefly in the volume, with respect both to the deaths 
from all causes and to deaths from specific causes. 

It occurred to the present reviewer that the analysis in these tables might
furnish a basis for establishing “normal” blood pressures which would be 
more meaningful than the usual “normal” tables that consist of the averages 
for persons considered to be “normal.” An objective definition of the upper 
limit of “normal” might be taken advantageously to be the systolic and dias- 
tolic pressures beyond which the actual mortalities are greater than the 
average. To make such an estimate critically would probably involve some 
smoothing of the material, and this reviewer feels that this might be worth 
while. 

ROBERT GAGE

Mayo Clinic


The Criminality of Youth, by Thorsten Sellin. Philadelphia: American Law
Institute. 1940. 116 pp. $1.50.


In this booklet, Mr. Sellin brings together salient statistical data on the 
criminality of youth and discusses some of the criminological significance of
the data. In chapter 1 the author indicates the limitations of existing statis- 
tics in determining criminality by age. He then presents data indicating the 
extent of arrests and convictions of persons in different ages, later relating 
these to appropriate population bases in order to obtain specific arrest and 
conviction rates. In general, the available statistics indicate that the crime 
rates are highest among youths aged 19. The data also show that specific 
types of crimes have modes at different age levels, ranging from 18 for prop- 
erty crimes, through the 20’s for violence against persons and crimes in- 
volving fraud, to the 30’s for crimes of desertion and the like. The foreign 
statistics presented suggest a mode at a somewhat higher age, though this is 
inconclusive since the data are given in age groups. 

The remaining three chapters deal with the problem of recidivism. In the 
first of these, the author reports that, while the majority of persons finger- 
printed upon arrest have had no prior record, three-fifths to two-thirds of 
those committed to a penal institution have had prior conflicts with the law, 
thereby sustaining the conclusion that “imprisonment, and especially im- 
prisonment in State prisons and reformatories, of those convicted of serious 
offenses appears to be largely a punishment reserved for offenders who have 
had many previous contacts with the law and who have previously served 
terms in penal and correctional institutions.” 

Chapter 3 deals with comparisons of first offenders and recidivists. Mr. 
Sellin’s most significant conclusion in this chapter is that many criminals are 
apprehended only once in their lives. This conclusion is well-nigh inevitable 
when one notes that conviction rates decline with age. In discussing the 
probability that persons who have once been convicted will again be con- 
victed for subsequent offenses, Mr. Sellin errs in saying that a person is more 
likely to become a second offender than a first offender. What the author
intended to say is that with each offense the probability of being convicted 
for a new offense increases in comparison with the probability that those who 
have never been convicted of an offense, or who have had fewer prior convic- 
tions, will be convicted. 

The final chapter deals with recidivism among youthful criminals. In this 
chapter the author attempts to answer two questions: (1) “What proportion 
of beginners in crime is found in youth compared with later age groups?” 
(2) “Are those who commit their first offenses during youth more or less 
likely to become recidivists than those who commit their first offense later 
in life?” 

With respect to the first question, some statistics are presented to show the 
proportions of convicted persons with prior convictions in different age 
groups. These proportions show an increase with age, reaching a maximum 
in the late 30’s and then declining. It is doubtful whether the statistics, in
the form presented, are of maximum significance in view of the fact that, by
and large, the likelihood of having been convicted once in one’s life-time 
increases with advancing age. 

Mr. Sellin does not attempt to explain the decline in the proportion of con- 
victed persons with records of previous convictions in the higher age levels. 
This decline does call for some explanation, in view of the fact that one is 
dealing with an additive phenomenon, so that the proportions of those with 
prior convictions should normally increase with age or at least remain con- 
stant instead of declining. 

With respect to the second question, Sellin says that he knows of no Ameri- 
can data that bear on the point. Therefore, it may be worthwhile to call at- 
tention to an article, “A Comparative Study of Certain Characteristics of 
1,000 Inmates of the Northeastern Penitentiary,”¹ in which the age
distributions of first offenders and recidivists are compared, showing that recidi-
vists tend to have a somewhat lower average age, indicating the greater 
susceptibility to recidivate among those who are convicted in their youth. 
The foreign data on this point cited are consistent with this conclusion. Mr. 
Sellin concludes his book with the statement, “Adequate treatment measures 
for the youth group are needed and if they can be made successful, the offense 
rates of later age groups should in the course of time show considerable de- 
cline.” The book is a useful compilation of statistical information from 
numerous sources, some of which are not easily accessible to American stu- 
dents. The statistics if subjected to more rigorous analysis may prove to be 
more instructive though probably inconclusive as guides to criminological 
practices and procedures. 

BARKEV S. SANDERS

Social Security Board 





National Unity and Disunity, The Nation as a Bio-social Organism, by George 
Kingsley Zipf. Bloomington, Indiana: The Principia Press, Inc. 1941. xv, 
408 pp. $3.50. 


This is a very important book. The main title unfortunately suggests that 
it may be just another of the effusions to which the academic fraternity as 
well as journalists recently have devoted themselves on this subject. The 
sub-title, which is more truly descriptive of the content, is more reassuring. 
The very first page of the Preface strikes the true keynote: 

“Some time ago it occurred to the author that we might learn much about 
our various social, economic and political problems if, instead of viewing man 
as ‘God’s noblest creation,’ we studied human group-behavior with the same 
ruthless objectivity with which a biologist might study the organized activity
of an ant hill, or of a bee-hive, or of a colony of termites. In other words, if
we viewed man as but just another case of social organization in Nature’s
balance, we might hope to discern something of the fundamental drives that
govern our behavior, regardless of whether that behavior happened to strike
us as being particularly ‘noble’ or ‘ignoble.’ And with this thought in mind,
the author began an investigation which, thanks to the help and
encouragement of friends, prospered rapidly, as the firm outlines of some very
precise and yet bafflingly simple social laws emerged ever more clearly from the
accumulating data.”

1 Sanders, Barkev S., “A Comparative Study of Certain Characteristics of 1,000 Inmates of the
Northeastern Penitentiary,” Reprint No. 1745 from the Public Health Reports, Vol. 51, No. 19, May 8,
1936, pp. 571-591.

As the briefest possible illustration of the type of law referred to, the
following will have to suffice:

“If the reader will consult the census of the United States for 1930 he will 
note a very curious relationship between all the communities that contain at 
least 2,500 inhabitants. Thus he will find that New York was first in size of 
population; that the second largest city had 1/2 as many inhabitants as New
York; that the third largest city had 1/3 as many inhabitants as New York;
that the fourth had 1/4 as many; the fifth 1/5; indeed that the nth largest
community had 1/n as many inhabitants as New York. This relationship
between the size and rank of our communities in 1930 was quite precise and,
we might add, is one that is eminently simple to express in mathematical
terms: 1, 1/2, 1/3, 1/4, 1/5, ..., 1/n.

“If another example is needed of the type of empiric law that emerged 
from our data, we might add that the incomes of the United States in 1929 
were distributed to individuals, corporations and other groups according to 
a relationship between size and rank that is both quite precise and quite 
simple to express in mathematical terms. For, as far as the data go, the
relationship between comparative size and rank is again: 1, 1/2, 1/3, 1/4, 1/5, ..., 1/n.”
The major part of the book consists of data drawn from various sources to 
demonstrate this thesis. 
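
The rank-size relationship quoted above is easy to state computationally. The following is only a minimal sketch with hypothetical community sizes, not census figures, and with function names invented for the illustration: it generates the sizes that the rule (largest size divided by rank) would predict and measures how far a set of observed sizes departs from them.

```python
def rank_size_prediction(largest, n_ranks):
    """Sizes predicted by the rank-size rule: the community of rank n
    has 1/n the size of the largest community."""
    return [largest / rank for rank in range(1, n_ranks + 1)]


def max_relative_deviation(observed):
    """Largest relative gap between the observed sizes (ranked from
    largest to smallest) and the sizes predicted by the rule."""
    ordered = sorted(observed, reverse=True)
    predicted = rank_size_prediction(ordered[0], len(ordered))
    return max(abs(o - p) / p for o, p in zip(ordered, predicted))


# Hypothetical sizes, in thousands, for five communities
sizes = [1200, 610, 395, 305, 236]

print(rank_size_prediction(1200, 5))
print(f"largest relative deviation: {max_relative_deviation(sizes):.3f}")
```

A small deviation for every rank is what the thesis asserts; whether actual census or income data satisfy it to that degree is the empirical question the book's tables are meant to settle.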

Space does not permit us here to review the adequacy of the data sub- 
mitted or to consider critically their applications to the present national and 
world situation. Dr. Zipf, who is University Lecturer at Harvard, makes 
vigorous applications of this sort, and some of his conclusions will be highly 
displeasing to large numbers of his colleagues both at Harvard and elsewhere. 
As to the method of approach, many of the criticisms brought against the 
late Raymond Pearl’s law of population growth and against the J-curve 
hypothesis recently advanced by social psychologists, will doubtless be 
brought against Dr. Zipf’s methods. Zipf’s work also will remind many of 
the neo-organismic theories of Corrado Gini. 

Without going into details and reservations, the present reviewer is of 
the opinion that with the return of relative sanity in world outlook and a 
more sober consideration of the fundamental problems of social science, Dr. 
Zipf’s analysis will be regarded, on the whole, as a distinguished contribu- 
tion. In any event there is no doubt that this is the type of analysis to which 
social scientists should devote more attention. The book is, for the most
part, non-technical and will be found delightful and highly profitable reading
by any person seriously interested in the social sciences.
GEORGE A. LUNDBERG


Bennington College 


The Art and Technique of Administration in German Ministries, by Arnold 
Brecht and Comstock Glaser. Cambridge: Harvard University Press. 
1940. xiv, 191 pp. $2.00. 


How do the wheels of German bureaucracy go round? The present vol- 
ume, after sketching the organization of national departments (ministries, 
independent bureaus, and subordinate bureaus attached to ministries) de- 
scribes the record and filing system of a German ministry. Collaboration of 
the higher officials in a German ministry so that the department head (the 
Minister) is properly informed through the permanent Secretary, the divi- 
sion directors and principals of any matter that should be brought to his at- 
tention, is interestingly discussed in a separate chapter. 

An annotated translation of the General Code of Administrative Proce- 
dure in the German Reich Ministries, of September 2, 1926, comprises Part 
II of the present book, whose senior author, Dr. Brecht, was the initiator 
and main writer of the General Code. Dr. Brecht, now a member of the 
Graduate Faculty of the New School for Social Research in New York, was 
chief of the Division for Constitution, Administration and Civil Service in 
the Reich Ministry of the Interior in the ’twenties. Changes in technical 
details of the Code under the Nazi regime have been surprisingly few, al- 
though the changes in the spirit and psychology of German bureaucracy 
under the Nazi dictatorship have doubtless been very considerable. 

While the General Administrative Code (comprising 112 sections) covers 
a wide field of administrative procedure in detail, it does not apply to budget- 
ary and financial matters nor to personnel problems—recruitment, training, 
promotion, transfer, removal, and pay. Almost half of the Code is devoted to 
“Dealing with the Contents of Incoming Letters,” but is broader in scope and 
detail than the title might indicate. Other chapters of the Code relate to 
Cabinet matters, house regulations (office hours, office space and supplies, 
telephones, and leaves of absence), library and publications, and official 
contacts outside the ministry. A set of general forms (in translation) prescribed 
for a Reich ministry is part of an exhibit to the General Administrative 
Code, of which appendices A and B comprise a filing room code and a 
copying office code, both in turn with model forms as exhibits. 

Space does not permit of a summary of the book’s somewhat succinct 
Chapter IX, “Significance [of the German Administrative Code] for the 
United States.” 

Harry L. FRANKLIN 

Office of Foreign Agricultural Relations 

U. S. Department of Agriculture 
















The Conditions of Economic Progress, by Colin Clark. London: The Macmil- 
lan Company. 1940. 504 pp. $5.00. 

In this book, Colin Clark, the brilliant Australian statistician, now director 
of the Queensland Bureau of Industry, brings together and synthesizes an 
enormous range of studies on the national incomes of over thirty nations; 
how these incomes are distributed, and what are the factors which go to 
determine them. Perhaps his most original contribution is to compute the 
comparative real incomes per worker of the various nations in terms of inter- 
national units of constant purchasing power. This is done by comparing aver- 
age incomes in a given country in terms of the average American income and 
in turn multiplying this ratio by the ratio of the cost of American goods at 
the prices of the given country to the cost of the goods of the given country 
at American prices. 

Aided by the studies of the International Labour Office on international 
differences in the cost of living, Mr. Clark finds that for the decade 1925-34 
the average real income per worker in the United States was $1,381; Canada’s 
average income is computed as $1,337; New Zealand’s as $1,202; while the 
average for Great Britain is found to be $1,069, and for Switzerland $1,018, 
and Australia $980. It is interesting to note that all but one of these are 
English-speaking countries. Then follows a group of countries, namely, 
Holland, Ireland, France, Denmark, Sweden, Germany, and Belgium, where 
the average annual income per worker ranges between $600 and $855. These 
are the countries of western Europe. Then after Norway come the countries 
of Central Europe, Austria and Czecho-Slovakia, while near the bottom with 
average incomes per worker of from $320 to $355, or less than one-fourth the 
real American average, are Italy, Japan, Soviet Russia, Poland, and most 
of the former east Baltic states. The average income of India is estimated at 
approximately $200 per worker, while that of China is given as around $120, 
or only about one-twelfth the American average. 
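
The comparisons with the American average quoted above can be checked by simple division. What follows is a minimal sketch using only the dollar figures cited in this review; the names `incomes` and `share_of_us` are illustrative and are not taken from Mr. Clark's book.

```python
# Average real income per worker, 1925-34, in international units,
# as quoted in the review (the India and China figures are approximate).
US_INCOME = 1381

incomes = {
    "Canada": 1337,
    "New Zealand": 1202,
    "Great Britain": 1069,
    "Switzerland": 1018,
    "Australia": 980,
    "India": 200,
    "China": 120,
}


def share_of_us(income, us_income=US_INCOME):
    """Return a country's income as a fraction of the American average."""
    return income / us_income


for country, income in incomes.items():
    print(f"{country}: {share_of_us(income):.3f} of the American average")
```

On these figures the $120 given for China falls between one-eleventh and one-twelfth of the American average, which agrees with the "about one-twelfth" stated above.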

As far as rates of growth in these incomes over the last 85 years are con- 
cerned, there is marked similarity between those for the United States and 
western Europe except that both Norway and Sweden show steeper growth 
curves than those for any other country. Mr. Clark points out that there is a 
universal tendency for the manufacturing occupations to grow more rapidly 
than the agricultural and for the trade and service industries to grow more 
swiftly than either, and he correctly ascribes these phenomena to differences 
in the income elasticities of the various groups of commodities and services. 

I have found the chapter on the role of capital to be particularly stimulat- 
ing in view of the way in which Mr. Clark has utilized the production func- 
tion developed by the reviewer and the interesting conclusions which he has 
drawn from it. 

I have been able to find only two inconsistencies or patent errors in his 
book. The first is where (p. 159) the conclusion is drawn that the American 
standard of life was lower in 1929 than in 1916, and the second is where it is 
stated (p. 245) that “inefficient agricultural methods seem to continue in the 
United States.” This may be true of cotton, but it is scarcely true of any 
other crop. 

But these are but minor flaws in a truly herculean task. Mr. Clark has 
accomplished a tour-de-force of statistical economics which is of great prac- 
tical value. He has opened the way for theoretical work which should seek 
to explain the differences he has revealed, and account for the similarities. 

PAUL H. DOUGLAS 


The University of Chicago 


Statistics of Income Supplement Compiled from Federal Income Tax Returns 
of Individuals for the Income Years 1934 and 1936. Washington: Department 
of the Treasury in cooperation with the Work Projects Administration. 
June 1940. 1934, 86 pp. 1936, 308 pp. 


In 1938, the Division of Research and Statistics of the United States 
Treasury Department in cooperation with the Works Progress Administration 
published Statistics of Income Supplement, Section I, making available 
valuable additional data compiled from individual income tax returns for 
1934. That work has been continued with the publication in 1940 of Section 
II of the Supplement for 1934 and of Sections I, II, and III of Statistics 
of Income Supplement Compiled from Income Tax Returns for 1936. 

Prepared under the guidance of Dr. Roy Blough, Director of Tax Research 
of the United States Treasury, these new volumes make available tabular 
data covering a wide range of subject matter relating to individual incomes, 
their magnitude, composition, and geographic distribution. The Division of 
Tax Research in making the study has adopted and consistently held to a 
high standard of statistical excellence. In economy of space and logic of or- 
ganization, in conciseness and clarity of titles, definitions, and footnotes, the 
publication is a model which, it may be hoped, others will use. The compara- 
bility of the data with figures for corresponding years in the Statistics of 
Income prepared by the Bureau of Internal Revenue is discussed in detail. 

Only a few of the many interesting and valuable classifications of income 
can be mentioned. Section I of the 1936 Supplement contains tables showing 
the geographical distribution of net income, deductions, and selected items 
of income or loss by size of net income for states, large metropolitan areas, 
and individual cities of 100,000 and over. While Section I of the Supplement 
for 1934 gave the number of individual returns by net income classes for all 
counties in the United States and for cities of 25,000 and over, the omission 
of this data in the 1936 publication is more than counterbalanced by the 
wealth of detail given for the states and large cities. Of special interest to 
those engaged in studies of national income are individual returns for the 
nation and for each smaller area, classified according to net income excluding 
capital net gains and losses. 

Section II, of principal interest to the tax administrator, provides information 
taken from matched returns of husbands and wives making separate 
returns. The tables in Section III set forth what the Division of Tax Research 
describes as “patterns of income.” Returns showing but one source of 
income are distributed by size of total income and by source while succeeding 
tables present corresponding information concerning incomes derived from 
various combinations of two or more sources. Correlation tables show the 
relation between the amounts of income from different sources, for example, 
between income from rents and that from interest. 

The foregoing description of these statistical compilations necessarily gives 
a completely inadequate idea of their comprehensiveness and importance. 
Economists and tax experts will find in them much of interest especially if 
their studies fall in the fields of tax administration, fiscal policy, or the in- 
come structure of the local economy. 

DONALD W. GILBERT 
The University of Rochester 


Savings, Investment, and National Income, by Oscar L. Altman. Temporary 
National Economic Committee. Monograph No. 37. Washington: Gov- 
ernment Printing Office. 1941. 135 pp. 


The material collected in the TNEC hearings that were devoted to the 
more general aspects of savings and investments is summarized, and in some 
directions extended, in this monograph. The author discusses only super- 
ficially the causal relations between savings, investment, and national in- 
come, and presents conclusions concerning economic policy that, at best, 
find no adequate basis in the monograph. The author’s real contribution is 
his compilation of statistics (many of which have only recently become 
available) on the aggregate magnitude of savings and investments in the 
United States during the last two decades, and on components of these ag- 
gregates; and of information concerning some aspects of their institutional 
background. The components are classified in various ways: for example, 
savings are given by type of claim and by size of company; and investments 
are broken down by industry and by kind of commodity. The report is, 
indeed, a fairly comprehensive collection of American statistics on both sav- 
ing and investment. It is all the more unfortunate, therefore, that it is marred 
by occasional faults: a lack of adequate discussion of concepts and terms, 
which is needed especially because diverse sources are cited; rather careless 
citation of statistics, as in Table 11, in which the 1923-24 figures given for the 
sum of depreciation and depletion really relate to depreciation alone, al- 
though the source cited gives the correct total for these years; juxtaposition 
of series that are only roughly comparable with one another (sometimes these 
are series that could have been made more comparable on the basis of data 
given in the sources cited), with insufficient warning as to the incomparabili- 
ties; inadequate explanation when explanation is vital, as in Appendix I, 
footnote 2; ambiguous statement, as on p. 58, where the author writes that 
the “volume of gross savings . . . states how much must be spent on other 
than consumption goods if the level of national income is to be maintained;” 
and even confusion, as in the discussion concerning inventory revaluations 
(p. 21 n). 

Despite these blemishes, however, the monograph will leave with the 
reader a vivid impression of the quantitative importance of saving and in- 
vestment in our economy, and a notion of the varied forms in which they 
make their appearance. 

SOLOMON FABRICANT 

National Bureau of Economic Research 


The Output of Manufacturing Industries, 1899-1937, by Solomon Fabricant 
with the assistance of Julius Shiskin. New York: National Bureau of Eco- 
nomic Research, Inc. 1940. xxiii, 685 pp. $4.50. 


Has the United States exhausted its capacity for industrial expansion? 
Can we consider the period from 1929 to 1937 when the output of manufac- 
turing industries failed to advance appreciably as a period of definite retro- 
gression? The negative answer to both of these questions is given by Solomon 
Fabricant in his very well written book, presenting results of the extensive 
study of manufacturing output from 1899 to 1937. 

Dr. Fabricant says that the forces making for growth in our economy were 
not dormant in what is widely viewed as a period of stagnation, and supports 
his statement by the detailed analysis of changes in output of 132 industries, 
calling to our attention the fact that the output of about half of the industries 
advanced during the years 1929-1937 in some instances by substantial 
amounts (from 9 to 15 per cent in the cases of paper, hosiery, woolen goods, 
footwear, leather, paints, ice cream, and confectionery up to increases of 
over 200 per cent for a list including refrigerators and rayon). 

As to the complete period between 1899 and 1937, covered by the study, 
we might say that the outstanding feature of all of the National Bureau of 
Economic Research index numbers, constructed by Solomon Fabricant, is 
their indication of a rate of advance in the output of all manufacturing indus- 
tries in the United States higher than the rates shown by other studies of 
manufacturing output. The only other comprehensive index covering this 
period, that computed by E. E. Day and Woodlief Thomas and extended by 
other investigators, indicates a rise of 203 per cent between 1899 and 1937, 
whereas the National Bureau index increases by 276 per cent. The two in- 
dexes run parallel to each other between 1899 and 1909, but then begin to 
diverge. From 1909 to 1937 the National Bureau index rises by 140 per cent, 
the Day-Thomas by only 90 per cent. The latter appears to be more sensitive 
to most of the cyclical movements than the National Bureau index: it climbs 
less rapidly between 1909 and 1914, and more rapidly between 1914 and 
1919; it falls and rises more precipitately between 1919 and 1923; between 
1925 and 1927 it declines slightly, whereas the National Bureau index moves 
upward; and between 1929 and 1933 it declines more sharply than the National 
Bureau index. Between 1927 and 1929 and between 1933 and 1937, 
the Day-Thomas index rises less rapidly than the National Bureau index. 
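
The percentage rises quoted above for the whole period and for 1909 to 1937 together imply a rise for 1899 to 1909, since growth factors over successive periods multiply. The sketch below works this out from the figures in the review alone; the function name `implied_rise` is illustrative, and the result is an arithmetic inference, not a figure reported by Dr. Fabricant.

```python
def implied_rise(total_rise_pct, later_rise_pct):
    """Percentage rise over the earlier sub-period implied by two rises.

    total_rise_pct: rise over the whole period (here 1899-1937).
    later_rise_pct: rise over the later sub-period (here 1909-1937).
    """
    total_factor = 1 + total_rise_pct / 100
    later_factor = 1 + later_rise_pct / 100
    # Whole-period factor = earlier factor * later factor.
    return (total_factor / later_factor - 1) * 100


# Figures quoted in the review.
national_bureau = implied_rise(276, 140)  # implied rise, 1899-1909
day_thomas = implied_rise(203, 90)        # implied rise, 1899-1909

print(f"National Bureau, 1899-1909: {national_bureau:.1f} per cent")
print(f"Day-Thomas, 1899-1909: {day_thomas:.1f} per cent")
```

Both implied rises come to somewhat under 60 per cent, which is consistent with the statement that the two indexes run parallel until 1909.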

The dissimilarities between the National Bureau index and the Day- 
Thomas index originate in differences of construction and coverage. The 
index numbers presented in this study are based on a greater number of in- 
dustries: for 1899-1909, 53 industries; for 1909-1919, 65; for 1919-1929, 74; 
and for 1929-1937, 132 industries; while the basis of the Day-Thomas index 
is for 1899-1909, 26 industries; for 1909-1914, 27; for 1914-1919, 28; and 
for 1919-1935, 49 industries. The essential point is that industries omitted 
in the Day-Thomas index are the new and rising industries such as rayon 
and rayon goods, refrigerators, washing machines, radios, machine tools, and 
new chemical products. 

The old Federal Reserve index, not revised until August 1940, parallels 
closely the Day-Thomas index for 1919-1933 so that it too rises less rapidly 
than the National Bureau index. From 1919 to 1937 the National Bureau 
index goes up 69 per cent as compared with 30 for the unrevised Federal 
Reserve index. From 1929 to 1933 the old Federal Reserve index moves 
downward similarly to the National Bureau index, but rises less rapidly from 
1933 to 1937. From 1929 to 1937 the National Bureau index records a net 
gain of 3 per cent whereas the old Federal Reserve index drops 9 or 10 per 
cent. 
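
One way to put the two total rises for 1919 to 1937 on a common footing is to convert each to an average compound annual rate. This conversion is not made in the book or in the review; it is offered only as an illustrative sketch, and `annual_rate` is a hypothetical helper name.

```python
def annual_rate(total_rise_pct, years):
    """Average compound annual rate (per cent) equivalent to a total rise."""
    factor = 1 + total_rise_pct / 100
    return (factor ** (1 / years) - 1) * 100


YEARS = 1937 - 1919  # 18 years

print(f"National Bureau: {annual_rate(69, YEARS):.2f} per cent a year")
print(f"Old Federal Reserve: {annual_rate(30, YEARS):.2f} per cent a year")
```

On this reckoning the National Bureau index rises at roughly 3 per cent a year and the unrevised Federal Reserve index at roughly 1.5 per cent a year.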

The new Federal Reserve index, completely revised for the period begin- 
ning with 1923, covers many industries not previously included, and takes 
account of the output of industries for which monthly data on production 
are not available by utilizing biennial or annual indexes of output computed 
in the study under review and other studies and monthly indexes of man 
hours of employment. It shows a rate of increase substantially higher than 
that indicated by the old Federal Reserve index, and is much nearer to the 
National Bureau index. 

Solomon Fabricant did not limit his study to the construction of the index 
numbers presented in this volume. A number of alternative indexes, constructed 
by methods differing in certain technical respects from the procedure followed 
in preparing the index presented in this volume, show rises ranging from 
238 per cent to 381 per cent for the period 1899 to 1937. Dr. Fabricant 
admits that, in view of the nature of the data and the length of the period 
covered, these differences from the published index are not large. The writer 
of this review believes that, as the new 
Federal Reserve index underwent substantial changes because of covering 
many new industries not previously included, it might be expected that in a 
similar way the Day-Thomas index would show a higher rate of increase if 
it were constructed with the new industries included. It appears that the 
series included in the indexes are more responsible for the magnitude of the 
final measures than the mathematical formulas used in construction of 
index numbers. 

The material in the book is arranged in such a practical way that either 
42, or 321, or 661 pages might be read. The first chapter of the book contains 
a summary of the changes in manufacturing output during the period under 
discussion, together with a brief consideration of the general economic sig- 
nificance of these events. Chapter 2 discusses the problem of measurement 
of physical output and describes the method of computation of the indexes. 
Chapter 3 describes changes in the total manufacturing output; and Chapter 
4 discusses the same problem in relation to major groups. Chapter 5 covers 
trends in the output of individual manufacturing industries. Part Two traces 
in greater detail the course of output for individual industries and for the 
groups into which they have been classified. Here, in addition to indexes for 
all the separate industries, the changing relative standing of each industry 
in its respective group is described. Technical notes and detailed index num- 
bers are given in appendices to the book. 

The preface to the book, giving a complete, brief, clear, and illuminating 
summary of the study, is written according to the best traditions of the 
Frederick C. Mills school. 

V. S. KOLESNIKOFF 


U.S. Bureau of the Budget 


The Newsprint Paper Industry: An Economic Analysis, by John A. Guthrie. 
Cambridge: Harvard University Press. 1941. xi, 274 pp. $3.50. 


This work comprises another valuable link in our lengthening series of 
studies of the economics of particular industries. Application of theories of 
competition, of costs, of location, of economic development, etc. to specific 
industrial situations not only is of much value to the theorist in providing 
working illustrations and supplying needed building blocks, but, if the the- 
ories are at all useful, the description and explanation of the industrial facts 
may be given improved organization and greater meaning. Mr. Guthrie 
chooses an extremely interesting but difficult industry for his attempted inte- 
gration of description and theory. The newsprint industry is one in which 
imperfections of competition clearly exist, due to the small number of sellers 
and their geographical separation, to financial tie-ups among producers and 
between producers and consumers, to a history of tacit and overt collective 
action among sellers, and to customary methods of price quotation, forms of 
contract, and lines of trade. Likewise it is an industry in which complex prob- 
lems of locational analysis inevitably arise, as well as problems of monop- 
sonistic wage and rent determination. 

The chief theoretical defect in Mr. Guthrie’s book, to the present reviewer, 
is his failure to pursue further or even to analyze clearly the relations of 
prices to economic rents—in this case stumpage costs. This failure impairs 
the value of the author’s entire analysis of interregional competition. He 
finds that “wood costs . . . undoubtedly are the chief determinant of regional 
advantage,” and agrees from time to time that wood costs “contain some 
element of imputed value or economic rent.” Yet this point is given entirely 
insufficient discussion, and the brief attempt to measure the amount of such 
rent is very inadequate. Mr. Guthrie attempts to eliminate the economic 
rent by taking wood cost figures for depression years, when, he says, the 
economic rent would have all been “squeezed out.” But obviously in these 
years the relative locational rent differentials are not removed; merely all 
are decreased. Unfortunately the crucial Table 13, which apparently was to 
have shown the differing amounts of reduction in wood costs in depression 
years among the various areas, fails to appear in the text. 

The author fails throughout to give clear-cut appreciation of the extent 
of the dependence of costs (in the form of rents) upon prices. He frequently 
mentions this dependence, yet his treatment fails clearly to distinguish these 
elements. This failure is likewise reflected in the confusion about transporta- 
tion costs, which are first treated as one of the coordinate cost elements, and 
then belatedly recognized as dependent upon the others: other costs deter- 
mine how far the product can be shipped, and this determines the average 
freight cost per ton of actual shipments. (The statement, however, is hardly 
correct that “theoretically we should expect that . . . a widening of the radius 
of distribution in low cost areas would make cost at the mill plus transportation 
equal in all regions.”) The difficulty, of course, is the familiar one 
of exposition of situations of price interdependence and mutual causation. 
But Mr. Guthrie’s difficulties seem at some points to be more than exposi- 
tional. 

At least two interesting theoretical issues are raised by the discussion which 
could well be further pursued. One is the question of rent determination 
under conditions of imperfect competition, here strikingly exhibited. Another 
is the monopsonistic positions of the producers as purchasers of labor, and 
the rent element in labor incomes. In many cases the newsprint mills are the 
sole important employers of a rather immobile labor supply. 

A large amount of statistical data has been gathered by the author, much 
of it from private sources and here published for the first time. The compila- 
tion of other published statistical information with regard to the newsprint 
industry makes the study relatively well documented, although the data are 
frequently not entirely comparable. A few of the tables and charts are not 
adequately labelled (Chart 10 omits labelling of one axis; Table E is titled 
“labor costs” instead of “wage rates”; Table 18 misuses the term “increase”; 
Table 23 seems to add cost figures per cord of wood and per ton of pulp; 
the unit is omitted from Table 24; etc.). Examples of other minor lapses are 
distressingly frequent. There are unexplainable kinks in the upper curve of 
Chart 2. Average deviations from arithmetic means are compared on p. 138, 
when it is not specifically noted that the means are significantly different. 
On pp. 123-4, a demand curve of an irreversible sort is postulated which 
seems to have little meaning; it implies that after a sufficient series of price 
fluctuations had taken place there would be no demand left. 
GARDNER ACKLEY 





University of Michigan 


Fiscal Capacity of the States, A Source Book. Washington: Social Security 
Board. Third Edition, Revised 1940. vii, 66, 406 pp. 

It is well recognized that the states differ widely both in aggregate econo- 
mic strength and in resources and productivity per capita or per square 
mile. Since various classes of business differ in ability to migrate under tax 
pressure, relative fiscal capacity probably is not measured with accuracy by 
relative strength in terms of resources and production. Since the states may 
differ considerably in political capacity, state governmental performance 
(with local government considered) may fail to reflect accurately the com- 
parative strength of the states either in outright resources and productivity 
or in fiscal capacity. 

Many of the states have had long and varied experience in allotting funds 
to their local governmental units for purposes of “equalization,” particularly 
in public education. Although such aid may do little to “equalize” conditions, 
since the districts fortunate enough to have both wealth and progressive out- 
look may raise their standards as much as do the poorer ones, at least the aid 
tends to raise minimum standards. And since raising minimum standards is 
an underlying aim in such programs, their planning necessitates examination 
of local needs and of local fiscal capacity. 

Federal grants-in-aid to the states (except emergency grants of recent 
years) have been mainly on a dollar-matching basis. But on such a basis, the 
weaker states may fail to do much dollar-matching, and thus fail to receive 
much aid, or they may undergo too much financial strain in matching Federal 
dollars. There is a sustained and increasing plea for Federal funds on a basis 
in which state inequalities of fiscal capacity and of need are recognized. And 
there is, consequently, a sustained and increasing urge to measure the fiscal 
capacity of the states. 

The third edition of Fiscal Capacity of the States has been prepared by the 
Bureau of Research and Statistics of the Social Security Board. Data have 
been extended through 1939 as far as possible, although some series could 
not be brought beyond 1938. In a 51-page introduction, the character of the 
materials, the methods of estimate, and the broad meaning of the tables are 
explained. The reader is cautioned in a number of instances to be careful in 
the use of estimates subject to a substantial margin of error. The material 
was not submitted to the Board for official approval, and the supply of the 
publication is not scaled for general circulation. But the care in its prepara- 
tion will serve both to give confidence in tentative use and to spur further 
analyses which may broaden measurement and reduce margins of error. 

J. P. Watson 


University of Pittsburgh 


Wage Differentials: A Study of Wage Rates in Philadelphia Metal Plants, by 
C. Canby Balderston. Philadelphia: Wharton School of Finance and Com- 
merce, University of Pennsylvania. 1939. 39 pp. $1.00. 


Successful wage administration within a plant clearly depends in part 
upon knowledge of the occupational wage structure in similar plants in the 
same labor market, or, more generally, upon knowledge of market wages for 
workers with the skills required by the enterprise. Some information of this 
nature is almost always possessed by plant officials concerned with wage 
policy; the information, however, may be derived from limited observation 
and may not serve adequately as a guide to policy. 

This brief monograph analyzes wages in a group of metal working plants 
in the Philadelphia area for the purpose of determining the occupations for 
which skill differentials tend to be the same, those for which little or no con- 
sistency in skill differentials can be discerned, variations in general wage 
levels among firms, and the effect of incentive payment plans upon the 
amount of employee earnings and upon skill differentials. Mr. Balderston 
points to various uses to which the findings of such a study can be put in the 
practice of the art of wage setting. 

In general, this study attempts to do too much in too little space. Its use- 
fulness is impaired by a lack of clarity that springs at least partly from 
brevity. The underlying data are not set forth, and hence the study has an 
air of excessive abstractness. 

H. M. DOUTY 

Wage and Hour Division 

U. S. Department of Labor 


Geographical Differentials in Prices of Building Materials, by Walter G. Keim. 
U. S. Department of Labor, Washington: Government Printing Office. 
Monograph No. 33, Investigation of Concentration of Economic Power, 
prepared for use of the Temporary National Economic Committee. 1940. 
xxii, 459 pp. 


This monograph, a study of 37 materials important in residential construc- 
tion, presents data for each of 50 cities located in every state of the United 
States and in the District of Columbia. In addition to brief statements con- 
cerning value of product in selected years, relative importance of producing 
areas, and price structures for each of the 37 materials, extensive data on 
wholesale and retail prices are presented. Monthly indexes of price change 
are shown—for each of 9 regions and for the country as a whole—covering 
the period January, 1935, through September, 1939, together with actual 
prices in the different localities for September, 1939, for selected commodi- 
ties. This is done for both wholesale and retail prices. 

The procedure followed in the collection and analysis of price data may 
be described briefly as follows: the price data selected for a given commodity 
in a given locality were those “furnished by a dealer who was a representative 
seller of the commodity”; they were not averages of prices furnished by a 
number of sellers. In the derivation of indexes of price change, weights were 
employed based upon “the total dollar volume of new residential building 
for which permits were issued” in the respective cities during the period 1937- 
1939. This was done for both the regional and country-wide composites. 
Simple arithmetic means of the September, 1939, wholesale and retail prices 
in the cities of a given region were used to measure the average spread be- 
tween such prices in the region. 

The validity of the procedure described above may be questioned on two 
counts. In the first place, the weights used to derive the country-wide in- 
dexes do not appear to be appropriate; too much weight is given to some re- 
gions and not enough to others. Thus, because of the presence of New York 
and Philadelphia, together with Trenton, in the Middle Atlantic region, this 
region alone is given a weight of 44 per cent of the total for all regions. In 
the case of plaster, for example, the Middle Atlantic is the only region which 
shows a drop in retail price from August to September, 1939, and the corre- 
sponding decline in the country-wide index (p. 54) is 3.8 per cent. The use of 
weights more nearly representative of the respective regions—based upon 
more comprehensive coverage than that of the 50 cities studied—would ap- 
pear to be more appropriate. 
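
The leverage exerted by a 44 per cent weight can be illustrated with the plaster figures just cited. The sketch below assumes that the country-wide index is a weighted arithmetic mean of the regional indexes, that the regional indexes stood at about the same level in August, and that every region other than the Middle Atlantic was unchanged. None of these assumptions is confirmed by the monograph (other regions may in fact have risen), and `implied_regional_change` is a hypothetical helper name.

```python
def implied_regional_change(national_change_pct, regional_weight):
    """Change in one region that alone would produce the national change.

    Assumes all other regions are unchanged and that the national index
    is a weighted arithmetic mean of regional indexes at a common level.
    """
    return national_change_pct / regional_weight


MIDDLE_ATLANTIC_WEIGHT = 0.44  # weight quoted in the review
NATIONAL_CHANGE = -3.8         # per cent, plaster, August to September 1939

change = implied_regional_change(NATIONAL_CHANGE, MIDDLE_ATLANTIC_WEIGHT)
print(f"Implied Middle Atlantic change: {change:.1f} per cent")
```

Under these assumptions a fall of between 8 and 9 per cent in a single region would account for the whole decline shown for the country.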

In the second place, the use of simple arithmetic means to measure average 
spreads between wholesale and retail prices, by regions, is unsatisfactory. In 
the case of oak flooring, for example, the average spread shown between 
such prices (p. 241) is 20 per cent of the wholesale price for both the East 
North Central and the Pacific regions. When weighted arithmetic means are 
employed, using the weights indicated for the respective cities, the spreads 
are 22 per cent of the wholesale price for the East North Central region and 
only 10 per cent for the Pacific region. For the Rocky Mountain region, the 
figure shown is 51 per cent, whereas the spread based upon weighted arith- 
metic means is 42 per cent and for Denver, the largest city in the region, it is 
31 per cent. 
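
The difference between the two ways of averaging can be shown with a small numerical sketch. The prices and weights below are hypothetical and are not taken from the monograph; the weights stand in for the building-permit volumes described above, and the function names are illustrative.

```python
def mean(values, weights=None):
    """Arithmetic mean; weighted when weights are supplied."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def spread_pct(wholesale, retail, weights=None):
    """Retail-wholesale spread as a percentage of the mean wholesale price."""
    w_mean = mean(wholesale, weights)
    r_mean = mean(retail, weights)
    return (r_mean - w_mean) / w_mean * 100


# Hypothetical prices for three cities in one region (not from the monograph).
wholesale = [100.0, 100.0, 100.0]
retail = [110.0, 130.0, 150.0]
weights = [8.0, 1.0, 1.0]  # hypothetical building-permit volumes

simple = spread_pct(wholesale, retail)
weighted = spread_pct(wholesale, retail, weights)
print(f"Simple mean spread:   {simple:.1f} per cent")
print(f"Weighted mean spread: {weighted:.1f} per cent")
```

With most of the weight falling on the city with the narrowest margin, the weighted spread (16 per cent) is well below the simple one (30 per cent), the same direction of difference as is reported above for the Pacific and Rocky Mountain regions.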

The monograph states that it is planned to “publish current data regard- 
ing these building material prices in the future on the same basis on which 
they are presented here.” This report, presenting such information for the 
first time, together with similar current data in the future, should—with 
modification perhaps to meet the above criticism—be of great value to the 
industry. 

F. L. CARMICHAEL 


University of Denver 


Residential Real Estate, by David L. Wickens. New York: National Bureau 
of Economic Research. 1941. xxii, 305 pp. 


This book consists primarily of statistical tables containing both original 
and compiled materials. The first 68 pages contain an interpretative introduction 
and a detailed explanation of statistical methods used in the preparation 
of original materials and in the utilization of those previously published. 

The interpretative observations are penetrating and serve to emphasize 
the salient conclusions arrived at after study of the results obtained by the 
painstaking procedures described. The remaining 229 pages present these 
results in tables, census-like in both their form and scope, and covering a 
number of aspects of non-farm residential real estate that have never before 
been presented either in as much detail or on as comprehensive a basis. 

The principal sources drawn upon are the Financial Survey of Urban
Housing, the Real Property Inventory, familiar Census tables, and the
building permits series of the Bureau of Labor Statistics. All these materials are
skillfully used (through frequently laborious methods devised for circum- 
venting conspicuous gaps or deficiencies of data) in arriving at a number of 
statistical series of major significance. These include the following estimates, 
several of which are given by population size-groups, by major geographical 
areas, and by states: number, value, and average value of dwelling units, by 
tenure as of April 1, 1930, and January 1, 1934; the number and value of 
owner-occupied one-family dwellings by value groups in 61 cities as of the 
same dates; the number and percentage distribution of dwelling units by 
age groups for 64 cities as of January 1, 1934; and several other percentage
distribution tables prepared primarily from Real Property Inventory and 
Financial Survey data. It is to be noted that these materials have not before 
been summarized for the cities covered nor have estimates been prepared 
from them. The basic tables for the Real Property Inventory were published 
city by city, and the Financial Survey tables for only 22 of the 52 cities 
covered. 

Section “C” on family income for 1929 and 1933 represents the distribu- 
tion of family income by income groups for 213,522 families. The data for 
these tables were secured either by personal enumeration or by mail in the 
conduct of the Financial Survey. Several of these tables represent special 
tabulations and give new materials on such important relationships as that 
between rent and income. The tables are given for all cities combined and 
for each city separately. 

In Section “D,” total mortgage indebtedness on non-farm residential prop- 
erty in 1934 is estimated at $26,078,684,000. This estimate is supported by 
detailed tables in which data are presented by tenure, by value groups, by 
geographical division and by cities, as well as by type of dwelling. Additional 
tables indicate financial terms and provisions of mortgages both first and 
junior, as of 1934, and are based primarily on the Financial Survey, with one 
table giving details on residential real estate financing in New York’s Lower 
East Side and Harlem. 

Estimates of the volume of residential construction from 1920 to 1936 
occupy the remaining pages. These estimates are carefully made, and follow 
in general the trends indicated by estimates previously published. But they 
are given in greater detail, by major geographical divisions and by a helpful 
segregation with respect to degree of urbanization of the area in which
building occurred.

The discussion of methods and of limitations of data and estimates is frank 
and disarming. For what they are worth, the tables are presented with all 
their “imperfections on their head.” For years to come they will constitute 
one of a very small but growing number of sources of data to which the 
student of land economics, housing, and home financing will go. 

The task of preparing such a volume from the scanty, scattered, and dis- 
crete materials available was a monumental one which has been accom- 
plished with credit to those who participated in it. 

ERNEST M. FISHER


The American Bankers Association 


Sales Control by Quantitative Methods, by R. Parker Eastwood. New York: 
Columbia University Press. 1940. xviii, 311 pp. $3.50. 


In this book Professor Eastwood has departed from convention in a num- 
ber of ways. Selected analytical tools of the accountant, statistician and 
market research analyst have been brought together in an effort to discover, 
define, and present effectively a mechanism for controlling sales operations. 

That this is a task of considerable magnitude is at once apparent and this 
fact forms the basis for the two principal criticisms of this book. Written for
a wide audience, the book is necessarily elementary and nontechnical in its
approach and the space which can be devoted to each of the analytical tools 
is so limited that effectiveness of presentation is somewhat impaired. The 
trained statistician will find little that is new to him in this work, although 
Professor Eastwood does provide a quick overall view of the problem which 
may stimulate further work in more advanced texts in the fields of account- 
ing and market research. 

Professor Eastwood states in the preface that the general objective of this 
study is “to establish a body of methods by which sales may be more effec- 
tively estimated and controlled.” The first half of the book is devoted to 
ratio and functional analysis as measures of past sales efforts and an outline 
of the necessity for taking short, medium, and long-term economic forces 
into account in predicting future sales performance. A few pages on mathe- 
matical curve fitting seem out of place in a book of this type, both because 
justice cannot be done to the subject in so brief a space and because this type 
of analysis, popular in older texts, has been replaced by more powerful sta- 
tistical tools during the past decade. 

The second half of the book is devoted to the presentation of material and 
methods normally coming within the scope of what is usually termed market- 
ing research. Chapters on collection of data, sampling theory and practice, 
tests of significance, the analysis and interpretation of data, and market in- 
dexes present a good overall view of problems connected with the marketing 
aspects of sales control. It would not be possible, however, and presumably 
the author did not intend, that the student could make effective use of many
formulas dealing with sampling procedure which are found in the book 
without further study in texts dealing with statistical methods. 

The chapter on theory of sampling is particularly weak in its discussion
of the arrangement of universes for sampling purposes. The current confusion
in the literature concerning the terms homogeneity, purposive sampling,
random sampling, and stratification is hardly clarified by this chapter. The
type of sampling employed by the Consumer Purchase Study in 1935-36 was 
declared to be “randomness supplementing purposiveness to form mixed 
sampling of the stratified variety.” 

A number of illustrations of surveys taken in the field by various marketing 
research organizations are used. An excellent list of classified references deal- 
ing with various phases of budgetary control and market research methods 
is included. One section of the appendix adequately summarizes the principal 
considerations to be watched in obtaining market information by means of 
mail questionnaires. 

ALFRED N. WATSON


The Curtis Publishing Company 


Land Tenure Policies at Home and Abroad, by Henry William Spiegel. Chapel 
Hill, North Carolina: University of North Carolina Press. 1941. xii, 171 
pp. $3.00. 


Land tenure policies adopted by administrative units can be said to have 
one or more of “three fundamental objectives: conservation, ‘adjustment’ 
for agriculture as a whole, and help for the ‘disadvantaged classes.’ ” Analy- 
sis from the economic point of view must encompass such considerations as 
the difference between marginal social net product and marginal private net 
product, the nature of the adjustment process in agriculture, and the role 
of investment as a stimulating factor in various segments of the economy. 
Also (though not readily segregated from the economic) there are the essen- 
tially political, sociological, and philosophical considerations which underlie 
each nation’s land-ownership program. 

Though the present study includes some analysis of these objectives along 
these lines, it is principally concerned with an exposition of certain land 
tenure policies adopted in recent years. There is a fairly long treatment of 
tenancy policies in the United States, particularly of the Bankhead-Jones 
Act. Primary attention, however, is paid to aspects of the programs in 
European countries. Indeed, the two final chapters deal exclusively with land 
tenure in England and the Third Reich. Thus a considerable amount of in- 
teresting material not generally known to the American student is made 
available. Undoubtedly, too, the clearly written chapter on the legal back- 
ground of land tenure will bring to many agricultural economists first insight 
into the character given by “the law” to such familiar institutions as owner- 
ship and the various forms of tenancy. 

Yet, because so much more could have been done with this general
analysis, the book may leave the reader with not much more than a picture of
these individual policies. Such analysis could have been used as a guide in 
discussing these programs, perhaps as a measure for evaluating them. Indeed, 
this defect is further accentuated by a presentation which suggests a series 
of almost independent essays rather than a well integrated volume. Thus the 
general development of the first two chapters (“Foundations” and “Legal 
Background”) is not brought to bear adequately on what follows. Similarly, 
an important section on the economics of farm tenancy policies first appears 
in the middle of the book, though its considerations are at least as pertinent 
to material preceding it. 

There is some tendency for Professor Spiegel to clinch his arguments by 
presenting data which reflect only in part the phenomena discussed. Thus, 
data showing the high ratio of illegitimate children in Sweden and Saxony 
(p. 149) are the “vicious consequences” of restrictions on equalitarian in- 
heritance in these countries. Again, French “property, especially land, is 
equally divided” while “in England it is customary to prefer a single heir; 
consequently wealth is much less unequally distributed in France than in 
England” (p. 22). 

Too much is also made of the argument that restrictions on equalitarian 
inheritance and the like tend to prevent that divisibility of resources which 
maximizes the national dividend. Certainly the continued operation of many 
farm units which are just “too small” suggests that some legal restrictions 
may contribute to national welfare. Similarly much more must be said to 
justify Professor Spiegel’s belief that the increase in the average size of newly 
established holdings (from about 11 to 19 hectares, 1927 to 1938) shows that 
the Nazi regime is “more interested in ‘crop factories’ than in the improve- 
ment of the social and economic status of the farm workers” (p. 117). 

Despite these defects (and enumeration tends to exaggerate their impor- 
tance) the author is to be commended for a stimulating treatment of his 
subject from a perspective which permits him to enter quite readily into such 
matters as forest land tenure and control, collective action, and credit poli- 
cies. His is a field of current importance, abounding with controversial ques- 
tions which cut across all the social sciences. Professor Spiegel has certainly 
presented a treatment which merits reading and will, as he hoped, “stimulate 
discussion of the issues involved.” 

WILFRED MALENBAUM 


Harvard University 


The International Gold Standard Reinterpreted, 1914-1934, by William Adams
Brown, Jr. New York: National Bureau of Economic Research, Inc. 
Two Volumes. 1940. xxx, 1420 pp. $12.00. 


A brief review can do no more than outline the contribution of a work
embodying more than a decade of study and covering as broad a scope as that
dealt with by Professor Brown. In form the book contains four parts: the 
first dealing with the breakdown during the last war; the second with the 
restoration up to 1925; the third with the experiment “of attempting to 
operate an international gold standard with a decentralized financial system
in an unbalanced world economy”; and the fourth with the disintegration 
after 1931. The study ends in 1934. 

In content, on the other hand, the “central theme is the progressive de- 
centralization of the world’s international credit system and the economic 
changes responsible for it.” The discussion is neither in the “realm of ulti- 
mate generalization and abstraction occupied by the pure theory of inter- 
national trade” nor does it suffer from the limitations of a monograph as 
have so many descriptive studies in the past, for the book is world-wide in 
its scope. Throughout, it is “institutional” in laying stress on the fact that 
the gold standard cannot be understood apart from the system of interna- 
tional credit and finance in which it operates. 

It is a “reinterpretation” because it explains the effectiveness of the pre- 
war gold standard mainly in terms of centralization of the international 
credit system in London and the weakness of the post-war standard mainly 
in terms of the decentralization of the system arising from the emergence of 
the United States as a financial power. As is to be expected from this em- 
phasis, a major portion of the study is devoted to detailed discussion of the 
actual operation of the financial systems involved; but an attempt is always 
made to relate the specific discussion to the broader repercussions which 
basic changes in the economic environment had on the character of the inter- 
national financial system taken as a whole. 

At times this attempt is remarkably successful. Professor Brown’s grasp 
of the many aspects of the problem often enables him to weave the isolated 
parts together into an excellent account of the changes in the system as a 
whole. But there are other times, as is perhaps inevitable with a study of 
this magnitude, in which the flow of the narrative gets lost among the de- 
tails. 

In addition, I cannot help feeling that as a “reinterpretation” the volume 
would have been immensely strengthened if Professor Brown had made 
clearer why a centralized financial system is necessarily stronger than a de- 
centralized system. Had this been done, I cannot help feeling that much of 
the weakness would be seen to result, not from the fact of decentralization, 
but because decentralization was accompanied by a growing disinclination to 
suffer the repercussions on employment that are necessary if the gold stand- 
ard is to operate in the face of rigid prices—in other words, because decen- 
tralization was accompanied by the factors which Professor Brown appar- 
ently feels have been too much stressed in the past. 

HENRY H. VILLARD


Amherst College 


The Volume of Consumer Instalment Credit, 1929-38, by Duncan McC. 
Holthausen in collaboration with Malcolm L. Merriam and Rolf Nugent. 
New York: National Bureau of Economic Research. 1940. xix, 137 pp. 
$1.50. 

The gist of this book, the seventh of a series, is stated in the first sentence 
of the preface: “The study presents annual and monthly estimates of the 
quantity of consumer instalment credit, including both retail instalment (sales
finance) credit and cash loan instalment credit, for the period 1929-38.” It
is not, however, quite so tabulated an affair as this sentence would seem to
indicate. Interesting side-lights on the business as a whole appear. Attention
should first be called to the fact that these estimates do not cover the entire
field of consumer finance, but only consumer instalment credit. They do not, 
for example, include open book accounts where there are no contracts calling 
for set payments by instalments. 

Increasingly liberal credit terms have helped greatly to swell the amount 
of outstanding retail instalment credit. Thus, while the amount granted in 
1937 was 15 per cent lower than in 1929, the average outstandings for 1937 
were 7 per cent higher, totalling $2,641,300,000, the largest amount for any 
one of the ten years. The surprising thing is that government agencies en- 
couraged public utilities to sell appliances on contract extending for thirty- 
six months, a much longer period than had been in use. 
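
The arithmetic behind this observation can be sketched roughly. The sketch assumes, as a simplification not stated in the book, that average outstandings are approximately proportional to the amount of credit granted multiplied by the average length of time a dollar of credit remains outstanding. Under that assumption the two percentages quoted above imply how much longer credit stayed on the books in 1937 than in 1929. The function name is hypothetical.

```python
def implied_duration_ratio(granted_ratio, outstandings_ratio):
    """Ratio of average time outstanding, later year to earlier year.

    Assumes outstandings ~ amount granted x average time outstanding,
    which is a steady-state approximation and not the authors' method.
    """
    return outstandings_ratio / granted_ratio


# Figures quoted in the review: amount granted in 1937 was 15 per cent
# below 1929; average outstandings in 1937 were 7 per cent above 1929.
granted_1937_vs_1929 = 1.00 - 0.15
outstandings_1937_vs_1929 = 1.00 + 0.07

ratio = implied_duration_ratio(granted_1937_vs_1929, outstandings_1937_vs_1929)
print(f"implied lengthening of average time outstanding: {(ratio - 1) * 100:.0f} per cent")
```

On this rough reckoning, credit remained outstanding about a quarter longer in 1937 than in 1929, which is consistent with the reviewer's remark about increasingly liberal terms.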

Automobile purchases are still the greatest single source of retail instal- 
ment selling. In 1929, they were 55.4 per cent of all such credit and in 1938 
50.9 per cent. About 60 per cent of all automobiles are sold on an instalment 
basis. The cost of household appliances decreased enormously during this 
ten-year period, greatly affecting the amount of credit granted. Average cash
loan outstandings were highest in 1938, being more than 100 per cent above 
the 1929 level. 

The increasing interest of commercial banks in the consumer finance field 
can be seen from the fact that of the 1095 banks reporting personal loan de- 
partments in 1938, 80 per cent had established these departments since 
1932. Credit unions have likewise had a great growth. Their average out- 
standings increased 242 per cent from 1929 to 1938. The outstandings of 
personal finance companies increased 50 per cent during this same period. 
The industrial banking companies show little or no increase. Seemingly, the
commercial banks, which have much the same loan plan, have taken over 
the additional business which the industrial banking concerns might have 
acquired. The business of the personal finance companies seems the most 
stable of them all, declining less in depression periods and increasing more 
slowly in periods of business expansion. 

Retail instalment indebtedness constitutes about 75 per cent of all con- 
sumer instalment credit indebtedness, but cash loan indebtedness seems to 
be growing at the expense of the retail indebtedness—a trend which in the 
reviewer’s opinion is from an economic point of view a healthy one. 

The student of business statistics will be interested to know that the 
Department of Commerce has provided for the continued preparation of 
these estimates by the Credit Analysis Unit, Marketing Research Division 
of the Bureau of Foreign and Domestic Commerce. Inasmuch as the Russell 
Sage Foundation provided the data on cash lending agencies, this will mean 
a considerable enlargement of the work of the Bureau which had hitherto
confined itself to data on retail instalment selling.

LOUIS N. ROBINSON


Industrial Banking Companies and Their Credit Practices, by Raymond J.
Saulnier. New York: National Bureau of Economic Research. 1940. xxi, 
192 pp. $2.00. 


Sales Finance Companies and Their Credit Practices, by Wilbur C. Plummer 
and Ralph A. Young. New York: National Bureau of Economic Research. 
1940. xxiii, 298 pp. $3.00. 


The study of sales finance companies is the second, and the work on 
industrial banking is the fourth, of a series dealing with institutions that 
participate in the instalment financing of consumers. This series was “in- 
augurated in 1938 by the National Bureau of Economic Research as the 
initial phase of a broad program of financial research under grants from the 
Association of Reserve City Bankers and the Rockefeller Foundation.” 

In reading these two volumes many curiosities will be satisfied. Anent 
sales finance companies the student will learn about their history and recent 
development, their legal position and financial structure, relationships with 
dealers and purchasers of goods, credit standards, fairness of charges, profita- 
bility of operation, abuses, effectiveness of competition. Similar matters are 
investigated in reference to Morris plan and related institutions in the vol- 
ume on industrial banking companies. With such a diversity of topics and 
factual data the reviewer can do little except to record his opinion that the 
research has been well done and that the results have been presented clearly 
and intelligently. 

These works should increase the student’s interest in a number of broad 
questions of banking structure and organization that it was no part of the 
authors’ responsibility to consider. In permitting the creation of new lending 
agencies, for instance, what guides should be followed by our supervisory 
and chartering authorities? Should lending agencies be divorced from de- 
posit institutions which would be supported solely by service charges? 
Should rigorous efforts be made to confine credit agencies to the special 
types of operation they were originally intended to conduct? After early 
development should such institutions be permitted to broaden their services 
under the guiding principle of convenience to the public and under the same 
terms as if they had been authorized initially to conduct a general banking 
business? If specialization of lending agencies is to be encouraged, which, if
any, should be administered under branch and chain organizations? How 
should the different agencies be linked together from the standpoint of 
membership in the F.D.I.C. and the Federal Reserve system? 

The longer consideration of the proper integration of lending agencies is 
postponed the more powerful will special interests become and the more 
difficult will ideal solution prove to be. Systematic planning, however, must 
take full advantage of all experience that has been gained. The works that 
are reviewed will be helpful in that they widen our understanding of credit 
practices in fields that have been somewhat neglected by academic writers. 

HAROLD L. REED


Cornell University